Key Terms
This module introduces a number of key terms related to statistical 
sampling and data. 


In statistics, we generally want to study a population. You can think of a 
population as an entire collection of persons, things, or objects under study. 
To study the larger population, we select a sample. The idea of sampling is 
to select a portion (or subset) of the larger population and study that portion 
(the sample) to gain information about the population. Data are the result of 
sampling from a population. 


Because it takes a lot of time and money to examine an entire population, 
sampling is a very practical technique. If you wished to compute the overall 
grade point average at your school, it would make sense to select a sample 
of students who attend the school. The data collected from the sample 
would be the students' grade point averages. In presidential elections, 
opinion poll samples of 1,000 to 2,000 people are taken. The opinion poll is 
supposed to represent the views of the people in the entire country. 
Manufacturers of canned carbonated drinks take samples to determine if a
16-ounce can contains 16 ounces of carbonated drink.


From the sample data, we can calculate a statistic. A statistic is a number 
that is a property of the sample. For example, if we consider one math class 
to be a sample of the population of all math classes, then the average 
number of points earned by students in that one math class at the end of the 
term is an example of a statistic. The statistic is an estimate of a population 
parameter. A parameter is a number that is a property of the population. 
Since we considered all math classes to be the population, then the average 
number of points earned per student over all the math classes is an example 
of a parameter. 


One of the main concerns in the field of statistics is how accurately a
statistic estimates a parameter. The accuracy depends on how well the
sample represents the population. The sample must contain the 
characteristics of the population in order to be a representative sample. We 
are interested in both the sample statistic and the population parameter in 
inferential statistics. In a later chapter, we will use the sample statistic to 
test the validity of the established population parameter. 


A variable, notated by capital letters like X and Y, is a characteristic of 
interest for each person or thing in a population. Variables may be 
numerical or categorical. Numerical variables take on values with equal 
units such as weight in pounds and time in hours. Categorical variables 
place the person or thing into a category. If we let X equal the number of 
points earned by one math student at the end of a term, then X is a
numerical variable. If we let Y be a person's party affiliation, then examples 
of Y include Republican, Democrat, and Independent. Y is a categorical 
variable. We could do some math with values of X (calculate the average 
number of points earned, for example), but it makes no sense to do math 
with values of Y (calculating an average party affiliation makes no sense). 


Data are the actual values of the variable. They may be numbers or they 
may be words. Datum is a single value. 


Two words that come up often in statistics are mean and proportion. If you
were to take three exams in your math classes and obtained scores of 86,
75, and 92, you calculate your mean score by adding the three exam scores
and dividing by three (your mean score would be 84.3 to one decimal
place). If, in your math class, there are 40 students and 22 are men and 18
are women, then the proportion of men students is 22/40 and the proportion of
women students is 18/40. Mean and proportion are discussed in more detail in
later chapters.
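
The arithmetic in the paragraph above can be checked with a few lines of Python. This is only a sketch; the variable names are chosen here for illustration and are not part of the text.

```python
# Exam scores and class counts taken from the paragraph above.
scores = [86, 75, 92]
mean_score = sum(scores) / len(scores)   # add the scores, divide by how many

men, women = 22, 18
class_size = men + women                 # 40 students in the class
proportion_men = men / class_size
proportion_women = women / class_size

print(round(mean_score, 1))              # mean to one decimal place
print(proportion_men, proportion_women)  # the two proportions as decimals
```

Note that the two proportions add to 1, because every student in the class falls into exactly one of the two groups.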


Note: 

Mean and Average 

The words "mean" and "average" are often used interchangeably. The 
substitution of one word for the other is common practice. The technical 
term is "arithmetic mean" and "average" is technically a center location. 
However, in practice among non-statisticians, "average" is commonly 
accepted for "arithmetic mean." 


Example: 


Exercise: 


Problem: 


Define the key terms from the following study: We want to know the 
average (mean) amount of money first year college students spend at 
ABC College on school supplies that do not include books. We 
randomly survey 100 first year students at the college. Three of those 
students spent $150, $200, and $225, respectively. 


Solution: 


The population is all first year students attending ABC College this 
term. 


The sample could be all students enrolled in one section of a 
beginning statistics course at ABC College (although this sample may 
not represent the entire population). 


The parameter is the average (mean) amount of money spent 
(excluding books) by first year college students at ABC College this 
term. 


The statistic is the average (mean) amount of money spent (excluding 
books) by first year college students in the sample. 


The variable could be the amount of money spent (excluding books) 
by one first year student. Let X = the amount of money spent 
(excluding books) by one first year student attending ABC College. 


The data are the dollar amounts spent by the first year students. 
Examples of the data are $150, $200, and $225. 


Optional Collaborative Classroom Exercise 


Do the following exercise collaboratively with up to four people per group. 
Find a population, a sample, the parameter, the statistic, a variable, and data 
for the following study: You want to determine the average (mean) number 
of glasses of milk college students drink per day. Suppose yesterday, in your 
English class, you asked five students how many glasses of milk they drank 
the day before. The answers were 1, 0, 1, 3, and 4 glasses of milk. 


Glossary 


Average 
A number that describes the central tendency of the data. There are a 
number of specialized averages, including the arithmetic mean, 
weighted mean, median, mode, and geometric mean. 


Data 
A set of observations (a set of possible outcomes). Most data can be 
put into two groups: qualitative (hair color, ethnic groups and other 
attributes of the population) and quantitative (distance traveled to 
college, number of children in a family, etc.). Quantitative data can be 
separated into two subgroups: discrete and continuous. Data is 
discrete if it is the result of counting (the number of students of a given 
ethnic group in a class, the number of books on a shelf, etc.). Data is 
continuous if it is the result of measuring (distance traveled, weight of 
luggage, etc.) 


Proportion 


• As a number: A proportion is the number of successes divided by
  the total number in the sample.

• As a probability distribution: Given a binomial random variable
  (RV), X ~ B(n, p), consider the ratio of the number X of
  successes in n Bernoulli trials to the number n of trials: P' = X/n.
  This new RV is called a proportion, and if the number of trials, n,
  is large enough, P' ~ N(p, pq/n), where q = 1 - p.


Data 

This module introduces the concepts of qualitative data, quantitative 
continuous data, and quantitative discrete data as used in statistics. Sample 
problems are included. 


Data may come from a population or from a sample. Small letters like x or 
y generally are used to represent data values. Most data can be put into the 
following categories: 


• Qualitative
• Quantitative


Qualitative data are the result of categorizing or describing attributes of a 
population. Hair color, blood type, ethnic group, the car a person drives, and 
the street a person lives on are examples of qualitative data. Qualitative data 
are generally described by words or letters. For instance, hair color might 
be black, dark brown, light brown, blonde, gray, or red. Blood type might 
be AB+, O-, or B+. Researchers often prefer to use quantitative data over
qualitative data because they lend themselves more easily to mathematical
analysis. For example, it does not make sense to find an average hair color
or blood type.


Quantitative data are always numbers. Quantitative data are the result of 
counting or measuring attributes of a population. Amount of money, pulse 
rate, weight, number of people living in your town, and the number of 
students who take statistics are examples of quantitative data. Quantitative 
data may be either discrete or continuous. 


All data that are the result of counting are called quantitative discrete 
data. These data take on only certain numerical values. If you count the 
number of phone calls you receive for each day of the week, you might get 
0, 1, 2, 3, etc. 


All data that are the result of measuring are quantitative continuous data,
assuming that we can measure accurately. Measuring angles in radians
might result in numbers such as π/6, π/3, π/2, π, 3π/4, etc. If you and your
friends carry backpacks with books in them to school, the numbers of books
in the backpacks are discrete data and the weights of the backpacks are
continuous data.


Note: In this course, the data used are mainly quantitative. It is easy to
calculate statistics (like the mean or proportion) from numbers. In the 
chapter Descriptive Statistics, you will be introduced to stem plots, 
histograms and box plots all of which display quantitative data. Qualitative 
data is discussed at the end of this section through graphs. 


Example: 

Data Sample of Quantitative Discrete Data 

The data are the number of books students carry in their backpacks. You 
sample five students. Two students carry 3 books, one student carries 4 
books, one student carries 2 books, and one student carries 1 book. The 
numbers of books (3, 4, 2, and 1) are the quantitative discrete data. 


Example: 

Data Sample of Quantitative Continuous Data 

The data are the weights of the backpacks with the books in them. You sample
the same five students. The weights (in pounds) of their backpacks are 6.2, 
7, 6.8, 9.1, 4.3. Notice that backpacks carrying three books can have 
different weights. Weights are quantitative continuous data because 
weights are measured. 


Example: 

Data Sample of Qualitative Data 

The data are the colors of backpacks. Again, you sample the same five 
students. One student has a red backpack, two students have black 
backpacks, one student has a green backpack, and one student has a gray 
backpack. The colors red, black, black, green, and gray are qualitative data. 


Note: You may collect data as numbers and report it categorically. For 
example, the quiz scores for each student are recorded throughout the term. 
At the end of the term, the quiz scores are reported as A, B, C, D, or F. 


Example: 
Exercise: 


Problem: 


Work collaboratively to determine the correct data type (quantitative 
or qualitative). Indicate whether quantitative data are continuous or 
discrete. Hint: Data that are discrete often start with the words "the 
number of." 

1. The number of pairs of shoes you own.
2. The type of car you drive.
3. Where you go on vacation.
4. The distance it is from your home to the nearest grocery store.
5. The number of classes you take per school year.
6. The tuition for your classes.
7. The type of calculator you use.
8. Movie ratings.
9. Political party preferences.
10. Weight of sumo wrestlers.
11. Amount of money won playing poker.
12. Number of correct answers on a quiz.
13. Peoples' attitudes toward the government.
14. IQ scores. (This may cause some discussion.)


Solution: 


Items 1, 5, 11, and 12 are quantitative discrete; items 4, 6, 10, and 14 
are quantitative continuous; and items 2, 3, 7, 8, 9, and 13 are 
qualitative. 


Qualitative Data Discussion 

Below are tables of part-time vs full-time students at De Anza College in 
Cupertino, CA and Foothill College in Los Altos, CA for the Spring 2010 
quarter. The tables display counts (frequencies) and percentages or 
proportions (relative frequencies). The percent columns make comparing 
the same categories in the colleges easier. Displaying percentages along 
with the numbers is often helpful, but it is particularly important when 
comparing sets of data that do not have the same totals, such as the total 
enrollments for both colleges in this example. Notice how much larger the 
percentage for part-time students at Foothill College is compared to De 
Anza College. 


             Number     Percent
Full-time     9,200      40.9%
Part-time    13,296      59.1%
Total        22,496     100%
De Anza College

             Number     Percent
Full-time     4,059      28.6%
Part-time    10,124      71.4%
Total        14,183     100%
Foothill College
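
The percent columns in the two tables are relative frequencies: each count divided by the table total. A minimal Python sketch, with a helper name (relative_frequencies) invented here for illustration, reproduces them from the counts:

```python
def relative_frequencies(counts):
    """Return each category's share of the total as a percent, to one decimal."""
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}

# Counts copied from the De Anza College and Foothill College tables.
de_anza = {"Full-time": 9200, "Part-time": 13296}
foothill = {"Full-time": 4059, "Part-time": 10124}

de_anza_pct = relative_frequencies(de_anza)
foothill_pct = relative_frequencies(foothill)
print(de_anza_pct)
print(foothill_pct)
```

Because the two colleges have different total enrollments, comparing the percents rather than the raw counts is what makes the part-time categories comparable.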


Tables are a good way of organizing and displaying data. But graphs can be 
even more helpful in understanding the data. There are no strict rules 
concerning what graphs to use. Below are pie charts and bar graphs, two 
graphs that are used to display qualitative data. 


In a pie chart, categories of data are represented by wedges in the circle 
and are proportional in size to the percent of individuals in each category. 


In a bar graph, the length of the bar for each category is proportional to the 
number or percent of individuals in each category. Bars may be vertical or 
horizontal. 


A Pareto chart consists of bars that are sorted into order by category size 
(largest to smallest). 


Look at the graphs and determine which graph (pie or bar) you think 
displays the comparisons better. This is a matter of preference. 


It is a good idea to look at a variety of graphs to see which is the most 
helpful in displaying the data. We might make different choices of what we 
think is the "best" graph depending on the data and the context. Our choice 
also depends on what we are using the data for. 


[Figures: pie charts of student status (Full Time vs. Part Time) for De Anza
College and for Foothill College (Full Time 28.6%, Part Time 71.4%), and a
bar graph of Student Status comparing De Anza and Foothill.]


Percentages That Add to More (or Less) Than 100% 

Sometimes percentages add up to be more than 100% (or less than 100%). 
In the graph, the percentages add to more than 100% because students can 
be in more than one category. A bar graph is appropriate to compare the 
relative size of the categories. A pie chart cannot be used. It also could not 
be used if the percentages added to less than 100%. 


Characteristic/Category                                        Percent
Full-time Students                                             40.9%
Students who intend to transfer to a 4-year educational
institution                                                    48.6%
Students under age 25                                          61.0%
TOTAL                                                          150.5%
De Anza College Spring 2010


Omitting Categories/Missing Data 

The table displays Ethnicity of Students but is missing the 
"Other/Unknown" category. This category contains people who did not feel 
they fit into any of the ethnicity categories or declined to respond. Notice 
that the frequencies do not add up to the total number of students. Create a 
bar graph and not a pie chart. 


                   Frequency                Percent
Asian              8,794                    36.1%
Black              1,412                    5.8%
Filipino           1,298                    5.3%
Hispanic           4,180                    17.1%
Native American    146                      0.6%
Pacific Islander   236                      1.0%
White              5,978                    24.5%
TOTAL              22,044 out of 24,382     90.4% out of 100%

Missing Data: Ethnicity of Students De Anza College Fall Term 2007
(Census Day)


[Figure: bar graph "Ethnicity of Students"; vertical axis in percent (0% to
40%), with one bar each for Asian, Black, Filipino, Hispanic, Native American,
Pacific Islander, and White.]

Bar Graph Without Other/Unknown Category


The following graph is the same as the previous graph but the 
"Other/Unknown" percent (9.6%) has been added back in. The 
"Other/Unknown" category is large compared to some of the other 
categories (Native American, 0.6%, Pacific Islander 1.0% particularly). 
This is important to know when we think about what the data are telling us. 


This particular bar graph can be hard to understand visually. The graph 
below it is a Pareto chart. The Pareto chart has the bars sorted from largest 
to smallest and is easier to read and interpret. 
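
As a rough illustration of the Pareto ordering, the sketch below recovers the "Other/Unknown" count from the ethnicity table's totals and then sorts the categories from largest to smallest. The variable names are made up for this example, and no plotting library is assumed; it only prints the sorted rows.

```python
# Frequencies copied from the ethnicity table (De Anza College, Fall 2007).
counts = {
    "Asian": 8794,
    "Black": 1412,
    "Filipino": 1298,
    "Hispanic": 4180,
    "Native American": 146,
    "Pacific Islander": 236,
    "White": 5978,
}
total_students = 24382

# The missing category is whatever is left after the listed categories.
counts["Other/Unknown"] = total_students - sum(counts.values())

# A Pareto chart uses this order: largest category first.
pareto_order = sorted(counts.items(), key=lambda item: item[1], reverse=True)

for category, n in pareto_order:
    print(f"{category:<17}{n:>7,}{100 * n / total_students:>7.1f}%")
```

In the sorted listing, "Other/Unknown" lands fourth, ahead of several named categories, which is easy to miss in the unsorted bar graph.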


[Figure: Ethnicity of Students. Bar Graph With Other/Unknown Category]

[Figure: Ethnicity of Students. Pareto Chart With Bars Sorted By Size]


Pie Charts: No Missing Data 

The following pie charts have the "Other/Unknown" category added back in 
(since the percentages must add to 100%). The chart on the right is
organized with the wedges sorted by size and makes for a more visually
informative graph than the unsorted, alphabetical graph on the left.


[Figure: two pie charts titled "Ethnicity of Students": alphabetical order
(left) and wedges sorted by size (right).]


Glossary 


Continuous Random Variable 
A random variable (RV) whose outcomes are measured. 


Example: 
The height of trees in the forest is a continuous RV. 


Data 
A set of observations (a set of possible outcomes). Most data can be 
put into two groups: qualitative (hair color, ethnic groups and other 
attributes of the population) and quantitative (distance traveled to 
college, number of children in a family, etc.). Quantitative data can be 
separated into two subgroups: discrete and continuous. Data is 
discrete if it is the result of counting (the number of students of a given 
ethnic group in a class, the number of books on a shelf, etc.). Data is 
continuous if it is the result of measuring (distance traveled, weight of 
luggage, etc.) 


Discrete Random Variable 
A random variable (RV) whose outcomes are counted. 


Qualitative Data 
See Data. 


Quantitative Data 
See Data. 


Sampling 

This module introduces the concept of statistical sampling. Students are 
taught the difference between a simple random sample, stratified sample, 
cluster sample, systematic sample, and convenience sample. Example 
problems are provided, including an optional classroom activity. 


Gathering information about an entire population often costs too much or is 
virtually impossible. Instead, we use a sample of the population. A sample 
should have the same characteristics as the population it is 
representing. Most statisticians use various methods of random sampling 
in an attempt to achieve this goal. This section will describe a few of the 
most common methods. 


There are several different methods of random sampling. In each form of 
random sampling, each member of a population initially has an equal 
chance of being selected for the sample. Each method has pros and cons. 
The easiest method to describe is called a simple random sample. Any
group of n individuals is as likely to be chosen as any other group of n
individuals if the simple random sampling technique is used. In other
words, each sample of the same size has an equal chance of being selected.
For example, suppose Lisa wants to form a four-person study group (herself 
and three other people) from her pre-calculus class, which has 31 members 
not including Lisa. To choose a simple random sample of size 3 from the 
other members of her class, Lisa could put all 31 names in a hat, shake the 
hat, close her eyes, and pick out 3 names. A more technological way is for 
Lisa to first list the last names of the members of her class together with a 
two-digit number as shown below. 


ID  Name          ID  Name          ID  Name
00  Anselmo       11  King          21  Roquero
01  Bautista      12  Legeny        22  Roth
02  Bayani        13  Lundquist     23  Rowell
03  Cheng         14  Macierz       24  Salangsang
04  Cuarismo      15  Motogawa      25  Slade
05  Cunningham    16  Okimoto       26  Stracher
06  Fontecha      17  Patel         27  Tallai
07  Hong          18  Price         28  Tran
08  Hoobler       19  Quizon        29  Wai
09  Jiao          20  Reyes         30  Wood
10  Khan
Class Roster


Lisa can either use a table of random numbers (found in many statistics 
books as well as mathematical handbooks) or a calculator or computer to 
generate random numbers. For this example, suppose Lisa chooses to 
generate random numbers from a calculator. The numbers generated are: 


.94360 .99832 .14669 .51470 .40581 .73381 .04399


Lisa reads two-digit groups until she has chosen three class members (that 
is, she reads .94360 as the groups 94, 43, 36, 60). Each random number 
may only contribute one class member. If she needed to, Lisa could have 
generated more random numbers. 


The random numbers .94360 and .99832 do not contain appropriate two-digit
numbers. However, the third random number, .14669, contains 14 (the
fourth random number also contains 14), the fifth random number contains
05, and the seventh random number contains 04. The two-digit number 14
corresponds to Macierz, 05 corresponds to Cunningham, and 04
corresponds to Cuarismo. Besides herself, Lisa's group will consist of
Macierz, Cunningham, and Cuarismo.
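
Lisa's reading procedure can be sketched in Python. This is one possible reading of the rules stated above (overlapping two-digit groups, IDs 00 through 30, at most one class member per random number); the function names are invented for this illustration.

```python
def two_digit_groups(decimal_text):
    """Split a decimal such as '.94360' into overlapping groups: 94, 43, 36, 60."""
    digits = decimal_text.lstrip(".")
    return [digits[i:i + 2] for i in range(len(digits) - 1)]

def pick_members(random_numbers, roster_size, group_size):
    """Read the random numbers in order until group_size distinct IDs are found."""
    chosen = []
    for number in random_numbers:
        for group in two_digit_groups(number):
            member_id = int(group)
            if member_id < roster_size and member_id not in chosen:
                chosen.append(member_id)
                break  # each random number contributes at most one member
        if len(chosen) == group_size:
            break
    return chosen

# The seven calculator outputs from the text; the roster has IDs 00 to 30.
generated = [".94360", ".99832", ".14669", ".51470", ".40581", ".73381", ".04399"]
chosen_ids = pick_members(generated, roster_size=31, group_size=3)
print(chosen_ids)
```

Run on the seven numbers above, the sketch selects IDs 14, 05, and 04 in that order, matching the group described in the text.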


Besides simple random sampling, there are other forms of sampling that 
involve a chance process for getting the sample. Other well-known 
random sampling methods are the stratified sample, the cluster sample, 
and the systematic sample. 


To choose a stratified sample, divide the population into groups called 
strata and then take a proportionate number from each stratum. For 
example, you could stratify (group) your college population by department 
and then choose a proportionate simple random sample from each stratum 
(each department) to get a stratified random sample. To choose a simple 
random sample from each department, number each member of the first 
department, number each member of the second department and do the 
same for the remaining departments. Then use simple random sampling to 
choose proportionate numbers from the first department and do the same for 
each of the remaining departments. Those numbers picked from the first 
department, picked from the second department and so on represent the 
members who make up the stratified sample. 
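
A proportionate stratified sample might be drawn as in the following sketch. The department names, sizes, and the 10% sampling fraction are hypothetical, chosen only to show the mechanics; random.Random(0) fixes the seed so the run is repeatable.

```python
import random

def stratified_sample(strata, fraction, rng):
    """Take a simple random sample of the same fraction from every stratum."""
    sample = {}
    for name, members in strata.items():
        k = round(len(members) * fraction)      # proportionate number
        sample[name] = rng.sample(members, k)   # sampled without replacement
    return sample

# Hypothetical departments (the strata); each member is numbered.
strata = {
    "Math": [f"M{i}" for i in range(40)],
    "English": [f"E{i}" for i in range(30)],
    "Art": [f"A{i}" for i in range(10)],
}

stratified = stratified_sample(strata, fraction=0.10, rng=random.Random(0))
for name, picked in stratified.items():
    print(name, picked)
```

Every department is represented, and larger departments contribute more members, which is what "proportionate" means here.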


To choose a cluster sample, divide the population into clusters (groups) 
and then randomly select some of the clusters. All the members from these 
clusters are in the cluster sample. For example, if you randomly sample four
departments from your college population, the four departments make up
the cluster sample. To do this, divide your college faculty by department.
The departments are the clusters. Number each department and then choose
four different numbers using simple random sampling. All members of the
four departments with those numbers are the cluster sample.
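
By contrast with stratified sampling, a cluster sample selects whole groups at random and keeps everyone in them. A minimal sketch, using made-up departments and faculty labels:

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose n_clusters clusters and keep every member of each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [person for name in chosen for person in clusters[name]]
    return chosen, members

# Hypothetical faculty lists, one per department (the clusters).
departments = {
    "Art": ["A1", "A2"],
    "Biology": ["B1", "B2", "B3"],
    "Chemistry": ["C1", "C2"],
    "English": ["E1", "E2", "E3", "E4"],
    "History": ["H1"],
    "Math": ["M1", "M2", "M3"],
}

chosen_departments, cluster_members = cluster_sample(departments, 4, random.Random(0))
print(chosen_departments)
print(cluster_members)
```

The randomness applies only to which departments are chosen; within a chosen department nobody is left out.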


To choose a systematic sample, randomly select a starting point and take
every nth piece of data from a listing of the population. For example,
suppose you have to do a phone survey. Your phone book contains 20,000
residence listings. You must choose 400 names for the sample. Number the
population 1 - 20,000 and then use a simple random sample to pick a
number that represents the first name of the sample. Then choose every
50th name thereafter until you have a total of 400 names (you might have to
go back to the beginning of your phone list). Systematic sampling is
frequently chosen because it is a simple method.
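
The phone-book example can be sketched as follows. The function name is invented for illustration; the step of 50 comes from 20,000 / 400, and the modulo arithmetic handles wrapping around to the start of the list.

```python
import random

def systematic_sample(population_size, sample_size, rng):
    """Pick a random start, then every step-th listing, wrapping past the end."""
    step = population_size // sample_size          # 20,000 // 400 = 50
    start = rng.randint(1, population_size)        # listings are numbered from 1
    return [((start - 1 + i * step) % population_size) + 1
            for i in range(sample_size)]

listing_numbers = systematic_sample(20_000, 400, random.Random(0))
print(listing_numbers[:5])
```

Because 400 steps of 50 cover the 20,000 listings exactly once around, no listing is selected twice.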


A type of sampling that is nonrandom is convenience sampling. 
Convenience sampling involves using results that are readily available. For 
example, a computer software store conducts a marketing study by 
interviewing potential customers who happen to be in the store browsing 
through the available software. The results of convenience sampling may be
very good in some cases and highly biased (favoring certain outcomes) in
others.


Sampling data should be done very carefully. Collecting data carelessly can 
have devastating results. Surveys mailed to households and then returned 
may be very biased (for example, they may favor a certain group). It is 
better for the person conducting the survey to select the sample 
respondents. 


True random sampling is done with replacement. That is, once a member 
is picked that member goes back into the population and thus may be 
chosen more than once. However for practical reasons, in most populations, 
simple random sampling is done without replacement. Surveys are 
typically done without replacement. That is, a member of the population 
may be chosen only once. Most samples are taken from large populations
and the sample tends to be small in comparison to the population. Since this
is the case, sampling without replacement is approximately the same as
sampling with replacement because the chance of picking the same
individual more than once when sampling with replacement is very low.


For example, in a college population of 10,000 people, suppose you want to
randomly pick a sample of 1,000 for a survey. For any particular sample
of 1,000, if you are sampling with replacement,


• the chance of picking the first person is 1,000 out of 10,000 (0.1000);

• the chance of picking a different second person for this sample is 999
  out of 10,000 (0.0999);

• the chance of picking the same person again is 1 out of 10,000 (very
  low).


If you are sampling without replacement,


• the chance of picking the first person for any particular sample is 1,000
  out of 10,000 (0.1000);

• the chance of picking a different second person is 999 out of 9,999
  (0.0999);

• you do not replace the first person before picking the next person.


Compare the fractions 999/10,000 and 999/9,999. For accuracy, carry the
decimal answers to four decimal places. To four decimal places, these
numbers are equivalent (0.0999).


Sampling without replacement instead of sampling with replacement only
becomes a mathematics issue when the population is small, which is not that
common. For example, if the population is 25 people, the sample is 10, and
you are sampling with replacement for any particular sample,


• the chance of picking the first person is 10 out of 25 and a different
  second person is 9 out of 25 (you replace the first person).


If you sample without replacement,


• the chance of picking the first person is 10 out of 25 and then the
  second person (which is different) is 9 out of 24 (you do not replace
  the first person).


Compare the fractions 9/25 and 9/24. To 4 decimal places, 9/25 = 0.3600 
and 9/24 = 0.3750. To 4 decimal places, these numbers are not equivalent. 
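
The two comparisons above can be reproduced with a short sketch. The function name is made up for this illustration; it simply evaluates the fractions used in the text for the second pick and rounds them to four decimal places.

```python
def second_pick_chances(population, sample):
    """Chance of a different second person, with and without replacement."""
    with_replacement = (sample - 1) / population
    without_replacement = (sample - 1) / (population - 1)
    return round(with_replacement, 4), round(without_replacement, 4)

large = second_pick_chances(10_000, 1_000)   # 999/10,000 vs 999/9,999
small = second_pick_chances(25, 10)          # 9/25 vs 9/24

print(large)
print(small)
```

For the large population the two values agree to four decimal places, while for the population of 25 they differ in the second decimal place.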


When you analyze data, it is important to be aware of sampling errors and 
nonsampling errors. The actual process of sampling causes sampling errors. 
For example, the sample may not be large enough. Factors not related to the 
sampling process cause nonsampling errors. A defective counting device 
can cause a nonsampling error. 


In reality, a sample will never be exactly representative of the population so 
there will always be some sampling error. As a rule, the larger the sample, 
the smaller the sampling error. 


In statistics, a sampling bias is created when a sample is collected from a 
population and some members of the population are not as likely to be 
chosen as others (remember, each member of the population should have an 
equally likely chance of being chosen). When a sampling bias happens, 
there can be incorrect conclusions drawn about the population that is being 
studied. 


Example: 
Exercise: 


Problem: 


Determine the type of sampling used (simple random, stratified, 
systematic, cluster, or convenience). 


1. A soccer coach selects 6 players from a group of boys aged 8 to 
10, 7 players from a group of boys aged 11 to 12, and 3 players 
from a group of boys aged 13 to 14 to form a recreational soccer 
team. 

2. A pollster interviews all human resource personnel in five 
different high tech companies. 

3. A high school educational researcher interviews 50 high school 
female teachers and 50 high school male teachers. 

4. A medical researcher interviews every third cancer patient from 
a list of cancer patients at a local hospital. 


5. A high school counselor uses a computer to generate 50 random 
numbers and then picks students whose names correspond to the 
numbers. 

6. A student interviews classmates in his algebra class to determine 
how many pairs of jeans a student owns, on the average. 


Solution: 


1. stratified
2. cluster
3. stratified
4. systematic
5. simple random
6. convenience

If we were to examine two samples representing the same population, even
if we used random sampling methods for the samples, they would not be 
exactly the same. Just as there is variation in data, there is variation in 
samples. As you become accustomed to sampling, the variability will seem 
natural. 


Example: 

Suppose ABC College has 10,000 part-time students (the population). We 
are interested in the average amount of money a part-time student spends 
on books in the fall term. Asking all 10,000 students is an almost 
impossible task. 

Suppose we take two different samples. 

First, we use convenience sampling and survey 10 students from a first
term organic chemistry class. Many of these students are taking first term
calculus in addition to the organic chemistry class. The amount of money
they spend is as follows:

$128 $87 $173 $116 $130 $204 $147 $189 $93 $153 


The second sample is taken by using a list from the P.E. department of 
senior citizens who take P.E. classes and taking every 5th senior citizen on 
the list, for a total of 10 senior citizens. They spend: 

$50 $40 $36 $15 $50 $100 $40 $53 $22 $22 

Exercise: 


Problem: 


Do you think that either of these samples is representative of (or is 
characteristic of) the entire 10,000 part-time student population? 


Solution: 


No. The first sample probably consists of science-oriented students. 
Besides the chemistry course, some of them are taking first-term 
calculus. Books for these classes tend to be expensive. Most of these 
students are, more than likely, paying more than the average part-time 
student for their books. The second sample is a group of senior 
citizens who are, more than likely, taking courses for health and 
interest. The amount of money they spend on books is probably much 
less than the average part-time student. Both samples are biased. Also, 
in both cases, not all students have a chance to be in either sample. 


Exercise: 
Problem: 


Since these samples are not representative of the entire population, is 
it wise to use the results to describe the entire population? 


Solution: 


No. For these samples, each member of the population did not have an 
equally likely chance of being chosen. 


Now, suppose we take a third sample. We choose ten different part-time
students from the disciplines of chemistry, math, English, psychology,
sociology, history, nursing, physical education, art, and early childhood
development. (We assume that these are the only disciplines in which
part-time students at ABC College are enrolled and that an equal number of
part-time students are enrolled in each of the disciplines.) Each student is
chosen using simple random sampling. Using a calculator, random
numbers are generated and a student from a particular discipline is selected
if he/she has a corresponding number. The students spend:

$180 $50 $150 $85 $260 $75 $180 $200 $200 $150 

Exercise: 


Problem: Is the sample biased? 
Solution: 


The sample is unbiased, but a larger sample would be recommended 
to increase the likelihood that the sample will be close to 
representative of the population. However, for a biased sampling 
technique, even a large sample runs the risk of not being 
representative of the population. 
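
To see how far apart the three samples in this example are, their means can be compared. The sketch below uses the dollar amounts listed above; the variable names are chosen here for illustration.

```python
# Dollar amounts copied from the three samples in this example.
first_sample = [128, 87, 173, 116, 130, 204, 147, 189, 93, 153]   # chemistry class
second_sample = [50, 40, 36, 15, 50, 100, 40, 53, 22, 22]         # senior citizens
third_sample = [180, 50, 150, 85, 260, 75, 180, 200, 200, 150]    # one per discipline

def mean(values):
    return sum(values) / len(values)

sample_means = [mean(s) for s in (first_sample, second_sample, third_sample)]
print(sample_means)
```

The three sample means come out to $142.00, $42.80, and $153.00. The gap between the first two reflects how each group was chosen, not just chance variation.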


Students often ask if it is "good enough" to take a sample, instead of 
surveying the entire population. If the survey is done well, the answer is 
yes. 


Optional Collaborative Classroom Exercise 


Exercise: 


Problem: 


As a class, determine whether or not the following samples are 
representative. If they are not, discuss the reasons. 


1. To find the average GPA of all students in a university, use all 
honor students at the university as the sample. 

2. To find out the most popular cereal among young people under 
the age of 10, stand outside a large supermarket for three hours 
and speak to every 20th child under age 10 who enters the 
supermarket. 


3. To find the average annual income of all adults in the United 
States, sample U.S. congressmen. Create a cluster sample by 
considering each state as a stratum (group). By using simple 
random sampling, select states to be part of the cluster. Then 
survey every U.S. congressman in the cluster. 

4. To determine the proportion of people taking public transportation 
to work, survey 20 people in New York City. Conduct the survey 
by sitting in Central Park on a bench and interviewing every 
person who sits next to you. 

5. To determine the average cost of a two day stay in a hospital in 
Massachusetts, survey 100 hospitals across the state using simple 
random sampling. 


Variation 

This module discusses statistical variability within data and samples. 
Students will be given the opportunity to see this variability in action 
through participation in an optional classroom exercise. This module also 
has a section that discusses Critical Evaluation. 


Variation in Data 


Variation is present in any set of data. For example, 16-ounce cans of 
beverage may contain more or less than 16 ounces of liquid. In one study, 
eight 16-ounce cans were measured and produced the following amounts (in
ounces) of beverage:


15.6 16.1 15.2 14.8 15.8 15.9 16.0 15.5


Measurements of the amount of beverage in a 16-ounce can may vary 
because different people make the measurements or because the exact 
amount, 16 ounces of liquid, was not put into the cans. Manufacturers 
regularly run tests to determine if the amount of beverage in a 16-ounce can 
falls within the desired range. 


Be aware that as you take data, your data may vary somewhat from the data 
someone else is taking for the same purpose. This is completely natural. 
However, if two or more of you are taking the same data and get very 
different results, it is time for you and the others to reevaluate your data- 
taking methods and your accuracy. 


Variation in Samples 


It was mentioned previously that two or more samples from the same
population, even when taken randomly and each closely reflecting the
characteristics of the population, are different from each other. Suppose
Doreen and Jung both decide to study the average amount of time students at
their college sleep each night. Doreen and Jung each take samples of 500
students. Doreen uses systematic sampling and Jung uses cluster sampling.
Doreen's sample will be different from Jung's sample. Even if Doreen and
Jung used the same sampling method, in all likelihood their samples would
be different. Neither would be wrong, however.


Think about what contributes to making Doreen's and Jung's samples 
different. 


If Doreen and Jung took larger samples (i.e. the number of data values is 
increased), their sample results (the average amount of time a student
sleeps) might be closer to the actual population average. But still, their
samples would be, in all likelihood, different from each other. This 
variability in samples cannot be stressed enough. 
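
The point can be illustrated with a short simulation. This is only a sketch under stated assumptions: the population of sleep times is invented (10,000 simulated students with a mean near 7 hours), and both samples are drawn by simple random sampling, which corresponds to the case where Doreen and Jung use the same method. The names `population` and `sample_mean` are hypothetical.

```python
import random
import statistics

# Hypothetical population: nightly sleep (hours) for 10,000 students.
# These values are simulated for illustration; they are not from the text.
population_rng = random.Random(0)
population = [round(population_rng.gauss(7.0, 1.5), 1) for _ in range(10_000)]

def sample_mean(seed, size=500):
    """Draw a simple random sample without replacement and return its mean."""
    rng = random.Random(seed)
    sample = rng.sample(population, size)
    return statistics.mean(sample)

doreen_mean = sample_mean(seed=1)
jung_mean = sample_mean(seed=2)
population_mean = statistics.mean(population)

print(f"Population mean: {population_mean:.3f}")
print(f"Doreen's sample: {doreen_mean:.3f}")
print(f"Jung's sample:   {jung_mean:.3f}")
```

Changing the seeds gives different sample means each time, yet each one should land close to the population mean. Neither sample is wrong.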


Size of a Sample 


The size of a sample (often called the number of observations) is important. 
The examples you have seen in this book so far have been small. Samples 
of only a few hundred observations, or even smaller, are sufficient for many 
purposes. In polling, samples that are from 1200 to 1500 observations are 
considered large enough and good enough if the survey is random and is 
well done. You will learn why when you study confidence intervals. 


Be aware that many large samples are biased. For example, call-in surveys 
are invariably biased because people choose whether or not to respond.


Optional Collaborative Classroom Exercise 


Exercise: 


Problem: 


Divide into groups of two, three, or four. Your instructor will give each 
group one 6-sided die. Try this experiment twice. Roll one fair die (6- 
sided) 20 times. Record the number of ones, twos, threes, fours, fives, 
and sixes you get below ("frequency" is the number of times a 
particular face of the die occurs): 


Face on Die    Frequency
1
2
3
4
5
6


First Experiment (20 rolls)


Face on Die    Frequency
1
2
3
4
5
6


Second Experiment (20 rolls)


Did the two experiments have the same results? Probably not. If you 
did the experiment a third time, do you expect the results to be 
identical to the first or second experiment? (Answer yes or no.) Why 
or why not? 


Which experiment had the correct results? They both did. The job of 
the statistician is to see through the variability and draw appropriate 
conclusions. 
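
If no dice are at hand, the experiment can be imitated on a computer. The following is a minimal sketch, assuming Python's pseudo-random generator is an acceptable stand-in for a fair die; the function name `roll_experiment` and the seed value are arbitrary choices for illustration.

```python
import random
from collections import Counter

def roll_experiment(rng, rolls=20):
    """Roll a fair six-sided die `rolls` times and tally each face."""
    tally = Counter(rng.randint(1, 6) for _ in range(rolls))
    # Report every face, including faces that never came up.
    return {face: tally.get(face, 0) for face in range(1, 7)}

rng = random.Random(2024)  # fixed seed so the run is repeatable
first = roll_experiment(rng)
second = roll_experiment(rng)

print("Face  First  Second")
for face in range(1, 7):
    print(f"{face:>4}  {first[face]:>5}  {second[face]:>6}")
```

Each column of frequencies sums to 20, but the two columns will usually differ, just as two groups of students rolling real dice would get different tallies.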


Critical Evaluation 


We need to critically evaluate the statistical studies we read about and 
analyze before accepting the results of the study. Common problems to be 
aware of include 


• Problems with Samples: A sample should be representative of the
population. A sample that is not representative of the population is
biased. Biased samples that are not representative of the population
give results that are inaccurate and not valid.

• Self-Selected Samples: Responses only by people who choose to
respond, such as call-in surveys, are often unreliable.

• Sample Size Issues: Samples that are too small may be unreliable.
Larger samples are better if possible. In some situations, small samples
are unavoidable and can still be used to draw conclusions, even though
larger samples are better. Examples: Crash testing cars, medical testing
for rare conditions.

• Undue Influence: Collecting data or asking questions in a way that
influences the response.

• Non-response or refusal of subject to participate: The collected
responses may no longer be representative of the population. Often,
people with strong positive or negative opinions may answer surveys,
which can affect the results.

• Causality: A relationship between two variables does not mean that
one causes the other to occur. They may both be related (correlated)
because of their relationship through a different variable.

• Self-Funded or Self-Interest Studies: A study performed by a person or
organization in order to support their claim. Is the study impartial?
Read the study carefully to evaluate the work. Do not automatically
assume that the study is good, but do not automatically assume the
study is bad either. Evaluate it on its merits and the work done.

• Misleading Use of Data: Improperly displayed graphs, incomplete
data, lack of context.

• Confounding: When the effects of multiple factors on a response
cannot be separated. Confounding makes it difficult or impossible to
draw valid conclusions about the effect of each factor.


Glossary 


Population 
The collection, or set, of all individuals, objects, or measurements 
whose properties are being studied. 


Sample 
A portion of the population under study. A sample is representative if it
characterizes the population being studied. 


Answers and Rounding Off 
This module briefly explains the correct way to round off answers when 
working with statistical data. 


A simple way to round off answers is to carry your final answer one more 
decimal place than was present in the original data. Round only the final 
answer. Do not round any intermediate results, if possible. If it becomes 
necessary to round intermediate results, carry them to at least twice as many 
decimal places as the final answer. For example, the average of the three 
quiz scores 4, 6, 9 is 6.3, rounded to the nearest tenth, because the data are 
whole numbers. Most answers will be rounded in this manner. 
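
The rounding rule can be checked with a few lines of Python. This is a small sketch; using `Fraction` to keep the intermediate result exact is one convenient choice, not the only one.

```python
from fractions import Fraction

scores = [4, 6, 9]  # quiz scores from the example (whole numbers)

# Keep the intermediate result exact; round only the final answer.
exact_mean = Fraction(sum(scores), len(scores))  # 19/3

# The data have zero decimal places, so report one decimal place.
final_answer = round(float(exact_mean), 1)

print(exact_mean, final_answer)
```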


It is not necessary to reduce most fractions in this course. Especially in 
Probability Topics, the chapter on probability, it is more helpful to leave an 
answer as an unreduced fraction. 


Frequency 

This module introduces the concepts of frequency, relative frequency, and 
cumulative relative frequency, and the relationship between these measures. 
Students will have the opportunity to interpret data through the sample problems 
provided. 


Twenty students were asked how many hours they worked per day. Their 
responses, in hours, are listed below: 


5 6 3 3 2 4 7 5 2 3 5 6 5 4 4 3 5 2 5 3


Below is a frequency table listing the different data values in ascending order and 
their frequencies. 


DATA VALUE    FREQUENCY
2             3
3             5
4             3
5             6
6             2
7             1


Frequency Table of Student Work Hours 


A frequency is the number of times a given datum occurs in a data set. 
According to the table above, there are three students who work 2 hours, five 
students who work 3 hours, etc. The total of the frequency column, 20, represents 
the total number of students included in the sample. 


A relative frequency is the fraction or proportion of times an answer occurs. To 
find the relative frequencies, divide each frequency by the total number of 
students in the sample - in this case, 20. Relative frequencies can be written as 
fractions, percents, or decimals. 


DATA VALUE    FREQUENCY    RELATIVE FREQUENCY
2             3            3/20 or 0.15
3             5            5/20 or 0.25
4             3            3/20 or 0.15
5             6            6/20 or 0.30
6             2            2/20 or 0.10
7             1            1/20 or 0.05


Frequency Table of Student Work Hours w/ Relative Frequency 


The sum of the relative frequency column is 20/20, or 1.


Cumulative relative frequency is the accumulation of the previous relative 
frequencies. To find the cumulative relative frequencies, add all the previous 
relative frequencies to the relative frequency for the current row. 


DATA                   RELATIVE        CUMULATIVE RELATIVE
VALUE    FREQUENCY     FREQUENCY       FREQUENCY
2        3             3/20 or 0.15    0.15
3        5             5/20 or 0.25    0.15 + 0.25 = 0.40
4        3             3/20 or 0.15    0.40 + 0.15 = 0.55
5        6             6/20 or 0.30    0.55 + 0.30 = 0.85
6        2             2/20 or 0.10    0.85 + 0.10 = 0.95
7        1             1/20 or 0.05    0.95 + 0.05 = 1.00


Frequency Table of Student Work Hours w/ Relative and Cumulative Relative 
Frequency 


The last entry of the cumulative relative frequency column is one, indicating that 
one hundred percent of the data has been accumulated. 


Note: Because of rounding, the relative frequency column may not always sum
to one and the last entry in the cumulative relative frequency column may not be 
one. However, they each should be close to one. 
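
The three columns can also be produced programmatically. The sketch below uses the twenty work-hour responses from this module; the function name `frequency_table` is a hypothetical helper, and exact fractions are used so that the cumulative column ends at exactly 1 with no rounding drift.

```python
from collections import Counter
from fractions import Fraction

# Hours worked per day by the twenty students surveyed.
hours = [5, 6, 3, 3, 2, 4, 7, 5, 2, 3, 5, 6, 5, 4, 4, 3, 5, 2, 5, 3]

def frequency_table(data):
    """Return rows of (value, frequency, relative freq., cumulative relative freq.)."""
    counts = Counter(data)
    total = len(data)
    rows = []
    cumulative = Fraction(0)
    for value in sorted(counts):
        relative = Fraction(counts[value], total)
        cumulative += relative  # add all previous relative frequencies
        rows.append((value, counts[value], relative, cumulative))
    return rows

table = frequency_table(hours)
for value, freq, rel, cum in table:
    print(f"{value:>5} {freq:>9} {float(rel):>8.2f} {float(cum):>10.2f}")
```

Converting to decimals only at print time follows the same advice given for rounding: keep intermediate results exact and round the final display.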


The following table represents the heights, in inches, of a sample of 100 male 
semiprofessional soccer players. 


HEIGHTS                         RELATIVE          CUMULATIVE RELATIVE
(INCHES)         FREQUENCY      FREQUENCY         FREQUENCY
59.95 - 61.95    5              5/100 = 0.05      0.05
61.95 - 63.95    3              3/100 = 0.03      0.05 + 0.03 = 0.08
63.95 - 65.95    15             15/100 = 0.15     0.08 + 0.15 = 0.23
65.95 - 67.95    40             40/100 = 0.40     0.23 + 0.40 = 0.63
67.95 - 69.95    17             17/100 = 0.17     0.63 + 0.17 = 0.80
69.95 - 71.95    12             12/100 = 0.12     0.80 + 0.12 = 0.92
71.95 - 73.95    7              7/100 = 0.07      0.92 + 0.07 = 0.99
73.95 - 75.95    1              1/100 = 0.01      0.99 + 0.01 = 1.00
                 Total = 100    Total = 1.00


Frequency Table of Soccer Player Height


The data in this table has been grouped into the following intervals:


• 59.95 - 61.95 inches
• 61.95 - 63.95 inches
• 63.95 - 65.95 inches
• 65.95 - 67.95 inches
• 67.95 - 69.95 inches
• 69.95 - 71.95 inches
• 71.95 - 73.95 inches
• 73.95 - 75.95 inches


Note: This example is used again in the Descriptive Statistics chapter, where the 
method used to compute the intervals will be explained. 


In this sample, there are 5 players whose heights are between 59.95 - 61.95 
inches, 3 players whose heights fall within the interval 61.95 - 63.95 inches, 15 
players whose heights fall within the interval 63.95 - 65.95 inches, 40 players 
whose heights fall within the interval 65.95 - 67.95 inches, 17 players whose 
heights fall within the interval 67.95 - 69.95 inches, 12 players whose heights fall 
within the interval 69.95 - 71.95, 7 players whose height falls within the interval 
71.95 - 73.95, and 1 player whose height falls within the interval 73.95 - 75.95. 
All heights fall between the endpoints of an interval and not at the endpoints. 


Example: 
Exercise: 


Problem: 


From the table, find the percentage of heights that are less than 65.95 
inches. 


Solution: 


If you look at the first, second, and third rows, the heights are all less than
65.95 inches. There are 5 + 3 + 15 = 23 males whose heights are less than
65.95 inches. The percentage of heights less than 65.95 inches is then 23/100
or 23%. This percentage is the cumulative relative frequency entry in the
third row.


Example: 


Exercise: 


Problem: 


From the table, find the percentage of heights that fall between 61.95 and 
65.95 inches. 


Solution: 


Add the relative frequencies in the second and third rows: 0.03 + 0.15 = 
0.18 or 18%. 
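
Both worked examples amount to summing frequencies over a run of adjacent classes. A minimal sketch, assuming the class boundaries and frequencies of the soccer height table; `percent_between` is a hypothetical helper that only accepts values that are exact class boundaries.

```python
# Class boundaries and frequencies copied from the soccer height table.
boundaries = [59.95, 61.95, 63.95, 65.95, 67.95, 69.95, 71.95, 73.95, 75.95]
frequencies = [5, 3, 15, 40, 17, 12, 7, 1]

def percent_between(lower, upper):
    """Percentage of heights in the classes lying between two class boundaries."""
    first = boundaries.index(lower)
    last = boundaries.index(upper)
    return 100 * sum(frequencies[first:last]) / sum(frequencies)

print(percent_between(59.95, 65.95))  # heights less than 65.95 inches
print(percent_between(61.95, 65.95))  # heights between 61.95 and 65.95 inches
```

Because no height falls on a boundary, it does not matter whether the endpoints are treated as included or excluded.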


Example: 
Exercise: 


Problem: 


Use the table of heights of the 100 male semiprofessional soccer players. 
Fill in the blanks and check your answers. 


1. The percentage of heights that are from 67.95 to 71.95 inches is:

2. The percentage of heights that are from 67.95 to 73.95 inches is:

3. The percentage of heights that are more than 65.95 inches is:

4. The number of players in the sample who are between 61.95 and 71.95
inches tall is:

5. What kind of data are the heights?

6. Describe how you could gather this data (the heights) so that the data
are characteristic of all male semiprofessional soccer players.


Remember, you count frequencies. To find the relative frequency, divide 
the frequency by the total number of data values. To find the cumulative 
relative frequency, add all of the previous relative frequencies to the 
relative frequency for the current row. 


Solution: 


1. 29%

2. 36%

3. 77%

4. 87

5. quantitative continuous

6. get rosters from each team and choose a simple random sample from
each


Optional Collaborative Classroom Exercise 


Exercise: 


Problem: 


In your class, have someone conduct a survey of the number of siblings 
(brothers and sisters) each student has. Create a frequency table. Add to it a 
relative frequency column and a cumulative relative frequency column. 
Answer the following questions: 


1. What percentage of the students in your class has 0 siblings? 
2. What percentage of the students has from 1 to 3 siblings? 
3. What percentage of the students has fewer than 3 siblings? 


Example: 

Nineteen people were asked how many miles, to the nearest mile, they commute
to work each day. The data are as follows:

2 5 7 3 2 10 18 15 20 7 10 18 5 12 13 12 4 5 10

The following table was produced: 


                      RELATIVE      CUMULATIVE RELATIVE
DATA    FREQUENCY     FREQUENCY     FREQUENCY
3       3             3/19          0.1579
4       1             1/19          0.2105
5       3             3/19          0.1579
7       2             2/19          0.2632
10      3             3/19          0.4737
12      2             2/19          0.7895
13      1             1/19          0.8421
15      1             1/19          0.8948
18      1             1/19          0.9474
20      1             1/19          1.0000


Frequency of Commuting Distances 


Exercise: 


Problem: 


1. Is the table correct? If it is not correct, what is wrong? 

2. True or False: Three percent of the people surveyed commute 3 miles. 
If the statement is not correct, what should it be? If the table is 
incorrect, make the corrections. 

3. What fraction of the people surveyed commute 5 or 7 miles? 

4. What fraction of the people surveyed commute 12 miles or more? 
Less than 12 miles? Between 5 and 13 miles (does not include 5 and 
13 miles)? 


Solution: 


1. No. Frequency column sums to 18, not 19. Not all cumulative relative 
frequencies are correct. 

2. False. Frequency for 3 miles should be 1; for 2 miles (left out), 2.
Cumulative relative frequency column should read: 0.1053, 0.1579,
0.2105, 0.3684, 0.4737, 0.6316, 0.7368, 0.7895, 0.8421, 0.9474, 1.


3. 5/19

4. 7/19, 12/19, 7/19
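
The corrected table and the fractions asked for in the problem can be verified with a short script. This is a sketch under stated assumptions: the frequencies below follow the solution's corrections (2 miles: 2 people; 3 miles: 1 person), and the frequency of 2 for 18 miles is inferred from the solution's cumulative relative frequency column. The helper `fraction_where` is hypothetical.

```python
from fractions import Fraction

# Corrected frequencies: commute distance (miles) -> number of people.
# Assumption: the count of 2 for 18 miles is inferred from the solution's
# cumulative column (0.8421 followed by 0.9474).
frequency = {2: 2, 3: 1, 4: 1, 5: 3, 7: 2, 10: 3, 12: 2, 13: 1, 15: 1, 18: 2, 20: 1}
total = sum(frequency.values())

def fraction_where(condition):
    """Fraction of respondents whose commute distance satisfies `condition`."""
    matching = sum(count for miles, count in frequency.items() if condition(miles))
    return Fraction(matching, total)

# Corrected cumulative relative frequency column.
running = 0
for miles in sorted(frequency):
    running += frequency[miles]
    print(f"{miles:>2} {frequency[miles]} {running / total:.4f}")

print(fraction_where(lambda m: m in (5, 7)))   # 5 or 7 miles
print(fraction_where(lambda m: m >= 12))       # 12 miles or more
print(fraction_where(lambda m: m < 12))        # less than 12 miles
print(fraction_where(lambda m: 5 < m < 13))    # between 5 and 13, exclusive
```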
Glossary
Frequency 


The number of times a value of the data occurs. 


Relative Frequency 


The ratio of the number of times a value of the data occurs in the set of all 
outcomes to the number of all outcomes. 


Cumulative Relative Frequency 


The term applies to an ordered set of observations from smallest to largest. 
The Cumulative Relative Frequency is the sum of the relative frequencies 
for all values that are less than or equal to the given value. 


Summary 

This module provides an outline/review of key concepts related to statistical 
sampling and data. 

Statistics

• Deals with the collection, analysis, interpretation, and presentation of
data


Probability

• Mathematical tool used to study randomness


Key Terms

• Population
• Parameter
• Sample
• Statistic
• Variable
• Data


Types of Data

• Quantitative Data (a number)
  o Discrete (You count it.)
  o Continuous (You measure it.)
• Qualitative Data (a category, words)


Sampling

• With Replacement: A member of the population may be chosen more
than once
• Without Replacement: A member of the population may be chosen
only once


Random Sampling

• Each member of the population has an equal chance of being selected


Sampling Methods

• Random
  o Simple random sample
  o Stratified sample
  o Cluster sample
  o Systematic sample
• Not Random
  o Convenience sample


Frequency (freq. or f)

• The number of times an answer occurs


Relative Frequency (rel. freq. or RF)

• The proportion of times an answer occurs
• Can be interpreted as a fraction, decimal, or percent


Cumulative Relative Frequencies (cum. rel. freq. or cum RF)

• An accumulation of the previous relative frequencies


Practice: Sampling and Data 

This module provides an opportunity for students to practice concepts 
related to statistical sampling and data. Given a sample data set, the student 
will practice constructing frequency tables, differentiating between key 
terms, and comparing sampling techniques. 


Student Learning Outcomes 


• The student will construct frequency tables.
• The student will differentiate between key terms.
• The student will compare sampling techniques.


Given 


Studies are often done by pharmaceutical companies to determine the 
effectiveness of a treatment program. Suppose that a new AIDS antibody 
drug is currently under study. It is given to patients once the AIDS 
symptoms have revealed themselves. Of interest is the average (mean)
length of time in months patients live once starting the treatment. Two 
researchers each follow a different set of 40 AIDS patients from the start of 
treatment until their deaths. The following data (in months) are collected. 


Researcher A: 3 4 11 15 16 17 22 44 37 16 14 24 25 15 26 27 33 29 35 44
13 21 22 10 12 8 40 32 26 27 31 34 29 17 8 24 18 47 33 34


Researcher B: 3 14 11 5 16 17 28 41 31 18 14 14 26 25 21 22 31 2 35 44 23
21 21 16 12 18 41 22 16 25 33 34 29 13 18 24 23 42 33 29


Organize the Data 


Complete the tables below using the data provided. 


Survival Length                 Relative      Cumulative Relative
(in months)       Frequency     Frequency     Frequency
0.5 - 6.5
6.5 - 12.5
12.5 - 18.5
18.5 - 24.5
24.5 - 30.5
30.5 - 36.5
36.5 - 42.5
42.5 - 48.5


Researcher A


Survival Length                 Relative      Cumulative Relative
(in months)       Frequency     Frequency     Frequency
0.5 - 6.5
6.5 - 12.5
12.5 - 18.5
18.5 - 24.5
24.5 - 30.5
30.5 - 36.5
36.5 - 42.5
42.5 - 48.5


Researcher B


Key Terms 


Define the key terms based upon the above example for Researcher A. 
Exercise: 


Problem: Population 


Exercise: 


Problem: Sample 


Exercise: 


Problem: Parameter 


Exercise: 


Problem: Statistic 


Exercise: 


Problem: Variable 


Exercise: 


Problem: Data 


Discussion Questions 


Discuss the following questions and then answer in complete sentences. 
Exercise: 


Problem: List two reasons why the data may differ. 
Exercise: 
Problem: 
Can you tell if one researcher is correct and the other one is incorrect? 
Why? 


Exercise: 


Problem: Would you expect the data to be identical? Why or why not? 


Exercise: 


Problem: How could the researchers gather random data? 
Exercise: 
Problem: 
Suppose that the first researcher conducted his survey by randomly 
choosing one state in the nation and then randomly picking 40 patients 


from that state. What sampling method would that researcher have 
used? 


Exercise: 


Problem: 


Suppose that the second researcher conducted his survey by choosing 
40 patients he knew. What sampling method would that researcher
have used? What concerns would you have about this data set, based 
upon the data collection method? 


Stem and Leaf Graphs (Stemplots), Line Graphs and Bar Graphs 
This module introduces the use of stem-and-leaf graphs (stemplots), line 
graphs and bar graphs for describing a set of data visually. 


One simple graph, the stem-and-leaf graph or stem plot, comes from the
field of exploratory data analysis. It is a good choice when the data sets are
small. To create the plot, divide each observation of data into a stem and a
leaf. The leaf consists of the final significant digit. For example, 23 has stem
2 and leaf 3. Four hundred thirty-two (432) has stem 43 and leaf 2. Five
thousand four hundred thirty-two (5,432) has stem 543 and leaf 2. The
decimal 9.3 has stem 9 and leaf 3. Write the stems in a vertical line from
smallest to largest. Draw a vertical line to the right of the stems. Then
write the leaves in increasing order next to their corresponding stem.


Example: 

For Susan Dean's spring pre-calculus class, scores for the first exam were 
as follows (smallest to largest): 
33 42 49 49 53 55 55 61 63 67 68 68 69 69 72 73 74 78 80 83 88 88 88 90 92
94 94 94 94 96 100


Stem    Leaf
3       3
4       2 9 9
5       3 5 5
6       1 3 7 8 8 9 9
7       2 3 4 8
8       0 3 8 8 8
9       0 2 4 4 4 4 6
10      0


Stem-and-Leaf Diagram 


The stem plot shows that most scores fell in the 60s, 70s, 80s, and 90s. 
Eight out of the 31 scores or approximately 26% of the scores were in the 
90's or 100, a fairly high number of As. 
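
The construction steps translate directly into code. The following is a minimal sketch for whole-number data such as the exam scores; the function name `stemplot` is a hypothetical helper, and it assumes the leaf is always the last digit.

```python
from collections import defaultdict

# First-exam scores from the example, smallest to largest.
scores = [33, 42, 49, 49, 53, 55, 55, 61, 63, 67, 68, 68, 69, 69, 72, 73,
          74, 78, 80, 83, 88, 88, 88, 90, 92, 94, 94, 94, 94, 96, 100]

def stemplot(values):
    """Group whole numbers by stem (all digits but the last) and leaf (last digit)."""
    stems = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        stems[stem].append(leaf)
    return dict(stems)

plot = stemplot(scores)
# Print every stem in range, so that gaps in the data stay visible.
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem:>2} | {leaves}")

high_scores = sum(1 for s in scores if s >= 90)
print(f"{high_scores} of {len(scores)} scores ({high_scores / len(scores):.0%}) are 90 or above")
```

Printing stems that have no leaves is a deliberate choice: an empty stem is what makes an outlier stand apart from the rest of the plot.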


The stem plot is a quick way to graph and gives an exact picture of the data. 
You want to look for an overall pattern and any outliers. An outlier is an 
observation of data that does not fit the rest of the data. It is sometimes 
called an extreme value. When you graph an outlier, it will appear not to fit 
the pattern of the graph. Some outliers are due to mistakes (for example, 
writing down 50 instead of 500) while others may indicate that something 
unusual is happening. It takes some background information to explain 
outliers. In the example above, there were no outliers. 


Example: 

Create a stem plot using the data: 
1.1 1.5 2.3 2.5 2.7 3.2 3.3 3.3 3.5 3.8 4.0 4.2 4.5 4.5 4.7 4.8 5.5 5.6 6.5 6.7 12.3
The data are the distance (in kilometers) from a home to the nearest 
supermarket. 

Exercise: 


Problem: 


1. Are there any values that might possibly be outliers? 
2. Do the data seem to have any concentration of values? 


Note: The leaves are to the right of the decimal.


Solution: 


The value 12.3 may be an outlier. Values appear to concentrate at 3 
and 4 kilometers. 


Stem    Leaf
1       1 5
2       3 5 7
3       2 3 3 5 8
4       0 2 5 5 7 8
5       5 6
6       5 7
7
8
9
10
11
12      3


Another type of graph that is useful for specific data values is a line graph. 
In the particular line graph shown in the example, the x-axis consists of 
data values and the y-axis consists of frequency points. The frequency 
points are connected. 


Example: 
In a survey, 40 mothers were asked how many times per week a teenager
must be reminded to do his/her chores. The results are shown in the table
and the line graph.


Number of times teenager is reminded    Frequency
0                                       2
1                                       5
2                                       8
3                                       14
4                                       7
5                                       4


[Line graph: Number of Times Teenager is Reminded on the x-axis and Frequency on the y-axis]


Bar graphs consist of bars that are separated from each other. The bars can 
be rectangles or they can be rectangular boxes and they can be vertical or 
horizontal. 


The bar graph shown in Example 4 has age groups represented on the x- 
axis and proportions on the y-axis. 


Example: 

By the end of 2011, in the United States, Facebook had over 146 million 
users. The table shows three age groups, the number of users in each age 
group and the proportion (%) of users in each age group. Source: 
http://www.kenburbary.com/2011/03/facebook-demographics-revisited- 
2011-statistics-2/ 


Age groups    Number of Facebook users    Proportion (%) of Facebook users
13 - 25       65,082,280                  45%
26 - 44       53,300,200                  36%
45 - 64       27,885,100                  19%


[Bar graph: Age groups on the x-axis and Proportion (%) of Facebook users on the y-axis]

Example: 

The columns in the table below contain the race/ethnicity of U.S. Public 
Schools: High School Class of 2011, percentages for the Advanced 
Placement Examinee Population for that class and percentages for the 
Overall Student Population. The 3-dimensional graph shows the 
Race/Ethnicity of U.S. Public Schools (qualitative data) on the x-axis and 
Advanced Placement Examinee Population percentages on the y-axis. 
(Source: http://www.collegeboard.com and Source: 
http://apreport.collegeboard.org/goals-and-findings/promoting-equity) 


                                                     AP Examinee    Overall Student
Race/Ethnicity                                       Population     Population
1 = Asian, Asian American or Pacific Islander        10.3%          5.7%
2 = Black or African American                        9.0%           14.7%
3 = Hispanic or Latino                               17.0%          17.6%
4 = American Indian or Alaska Native                 0.6%           1.1%
5 = White                                            57.1%          59.2%
6 = Not reported/other                               6.0%           1.7%


[Bar graph: Ethnicity/Race vs. Percent of AP Examinees]

Go to Outcomes of Education Figure 22 for an example of a bar graph that 
shows unemployment rates of persons 25 years and older for 2009. 


Note: This book contains instructions for constructing a histogram and a
box plot for the TI-83+ and TI-84 calculators. You can find additional
instructions for using these calculators on the Texas Instruments (TI)
website.


Glossary 


Outlier 
An observation that does not fit the rest of the data. 


Histograms 

This module provides an overview of Descriptive Statistics: Histogram as a 
part of Collaborative Statistics collection (col10522) by Barbara Illowsky 
and Susan Dean. 


For most of the work you do in this book, you will use a histogram to 
display the data. One advantage of a histogram is that it can readily display 
large data sets. A rule of thumb is to use a histogram when the data set 
consists of 100 values or more. 


A histogram consists of contiguous boxes. It has both a horizontal axis and 
a vertical axis. The horizontal axis is labeled with what the data represents 
(for instance, distance from your home to school). The vertical axis is 
labeled either Frequency or relative frequency. The graph will have the 
same shape with either label. The histogram (like the stemplot) can give 
you the shape of the data, the center, and the spread of the data. (The next 
section tells you how to calculate the center and the spread.) 


The relative frequency is equal to the frequency for an observed value of 
the data divided by the total number of data values in the sample. (In the 
chapter on Sampling and Data, we defined frequency as the number of 
times an answer occurs.) If: 


• f = frequency

• n = total number of data values (or the sum of the individual
frequencies), and

• RF = relative frequency,


then:
Equation:


$RF = \frac{f}{n}$


For example, if 3 students in Mr. Ahab's English class of 40 students 
received from 90% to 100%, then, 


$f = 3$, $n = 40$, and $RF = \frac{f}{n} = \frac{3}{40} = 0.075$


Seven and a half percent of the students received 90% to 100%. Ninety
percent to 100% are quantitative measures.


To construct a histogram, first decide how many bars or intervals, also 
called classes, represent the data. Many histograms consist of from 5 to 15 
bars or classes for clarity. Choose a starting point for the first interval to be 
less than the smallest data value. A convenient starting point is a lower 
value carried out to one more decimal place than the value with the most 
decimal places. For example, if the value with the most decimal places is 
6.1 and this is the smallest value, a convenient starting point is 6.05 (6.1 - 
0.05 = 6.05). We say that 6.05 has more precision. If the value with the 
most decimal places is 2.23 and the lowest value is 1.5, a convenient 
starting point is 1.495 (1.5 - 0.005 = 1.495). If the value with the most
decimal places is 3.234 and the lowest value is 1.0, a convenient starting
point is 0.9995 (1.0 - 0.0005 = 0.9995). If all the data happen to be integers
and the smallest value is 2, then a convenient starting point is 1.5 (2 - 0.5 = 
1.5). Also, when the starting point and other boundaries are carried to one 
additional decimal place, no data value will fall on a boundary. 
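
The rule for a convenient starting point can be written as a one-line calculation. This is a sketch, assuming the number of decimal places in the data is known in advance; `starting_point` is a hypothetical helper, and `Decimal` is used so that values such as 6.05 are represented exactly rather than as binary floating-point approximations.

```python
from decimal import Decimal

def starting_point(smallest, most_decimals):
    """Subtract 5 in the first unused decimal place from the smallest value.

    smallest      -- the smallest data value, given as a string or Decimal
    most_decimals -- the largest number of decimal places found in the data
    """
    offset = Decimal(5) * Decimal(10) ** -(most_decimals + 1)
    return Decimal(smallest) - offset

# The four cases described in the paragraph above.
print(starting_point("6.1", 1))
print(starting_point("1.5", 2))
print(starting_point("1.0", 3))
print(starting_point("2", 0))
```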


Example: 

The following data are the heights (in inches to the nearest half inch) of 
100 male semiprofessional soccer players. The heights are continuous data 
since height is measured. 

60 60.5 61 61 61.5 

63.5 63.5 63.5 

64 64 64 64 64 64 64 64.5 64.5 64.5 64.5 64.5 64.5 64.5 64.5 

66 66 66 66 66 66 66 66 66 66 66.5 66.5 66.5 66.5 66.5 66.5 66.5 66.5 
66.5 66.5 66.5 67 67 67 67 67 67 67 67 67 67 67 67 67.5 67.5 67.5 67.5
67.5 67.5 67.5

68 68 69 69 69 69 69 69 69 69 69 69 69.5 69.5 69.5 69.5 69.5

70 70 70 70 70 70 70.5 70.5 70.5 71 71 71

74


The smallest data value is 60. Since the data with the most decimal places 
has one decimal (for instance, 61.5), we want our starting point to have two 
decimal places. Since the numbers 0.5, 0.05, 0.005, etc. are convenient 
numbers, use 0.05 and subtract it from 60, the smallest value, for the 
convenient starting point. 

60 - 0.05 = 59.95 which is more precise than, say, 61.5 by one decimal 
place. The starting point is, then, 59.95. 

The largest value is 74. 74+ 0.05 = 74.05 is the ending value. 

Next, calculate the width of each bar or class interval. To calculate this 
width, subtract the starting point from the ending value and divide by the 
number of bars (you must choose the number of bars you desire). Suppose 
you choose 8 bars. 

Equation:


$\frac{74.05 - 59.95}{8} = 1.76$


Note: We will round up to 2 and make each bar or class interval 2 units 
wide. Rounding up to 2 is one way to prevent a value from falling on a 
boundary. Rounding to the next number is necessary even if it goes 
against the standard rules of rounding. For this example, using 1.76 as the 
width would also work. 


The boundaries are: 


• 59.95
• 59.95 + 2 = 61.95
• 61.95 + 2 = 63.95
• 63.95 + 2 = 65.95
• 65.95 + 2 = 67.95
• 67.95 + 2 = 69.95
• 69.95 + 2 = 71.95
• 71.95 + 2 = 73.95
• 73.95 + 2 = 75.95


The heights 60 through 61.5 inches are in the interval 59.95 - 61.95. The 
heights that are 63.5 are in the interval 61.95 - 63.95. The heights that are 
64 through 64.5 are in the interval 63.95 - 65.95. The heights 66 through 
67.5 are in the interval 65.95 - 67.95. The heights 68 through 69.5 are in 
the interval 67.95 - 69.95. The heights 70 through 71 are in the interval 
69.95 - 71.95. The heights 72 through 73.5 are in the interval 71.95 - 
73.95. The height 74 is in the interval 73.95 - 75.95. 
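
The width and boundary calculations for this example can be sketched in Python as follows. The starting point, ending value, and number of bars are taken from the example; `interval_for` is a hypothetical helper, and `Decimal` is an implementation choice that keeps boundaries such as 61.95 exact.

```python
import math
from decimal import Decimal

start = Decimal("59.95")  # smallest value 60 minus 0.05
end = Decimal("74.05")    # largest value 74 plus 0.05
bars = 8                  # the number of bars chosen in the example

raw_width = (end - start) / bars  # 1.7625
width = math.ceil(raw_width)      # round up to the next whole number

# Each boundary is the previous boundary plus the width.
boundaries = [start + width * i for i in range(bars + 1)]

def interval_for(height):
    """Return the (lower, upper) class boundaries that contain a height."""
    value = Decimal(str(height))
    for lower, upper in zip(boundaries, boundaries[1:]):
        if lower < value < upper:
            return lower, upper
    raise ValueError(f"{height} lies outside the histogram range")

print(raw_width, width)
print([str(b) for b in boundaries])
print(interval_for(63.5))
```

The strict inequalities in `interval_for` are safe here because the boundaries carry one more decimal place than the data, so no height can equal a boundary.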

The following histogram displays the heights on the x-axis and relative 
frequency on the y-axis. 


[Histogram: Heights on the x-axis, with class boundaries from 59.95 to 75.95, and Relative Frequency on the y-axis]


Example: 
The following data are the number of books bought by 50 part-time college
students at ABC College. The number of books is discrete data since books
are counted.

1 1 1 1 1 1 1 1 1 1 1

2 2 2 2 2 2 2 2 2 2

3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

4 4 4 4 4 4

5 5 5 5 5

6 6

Eleven students buy 1 book. Ten students buy 2 books. Sixteen students 
buy 3 books. Six students buy 4 books. Five students buy 5 books. Two 
students buy 6 books. 

Because the data are integers, subtract 0.5 from 1, the smallest data value 
and add 0.5 to 6, the largest data value. Then the starting point is 0.5 and 
the ending value is 6.5. 

Exercise: 


Problem: 


Next, calculate the width of each bar or class interval. If the data are
discrete and there are not too many different values, a width that
places the data values in the middle of the bar or class interval is the
most convenient. Since the data consist of the numbers 1, 2, 3, 4, 5, 6
and the starting point is 0.5, a width of one places the 1 in the middle
of the interval from 0.5 to 1.5, the 2 in the middle of the interval from
1.5 to 2.5, the 3 in the middle of the interval from 2.5 to 3.5, the 4 in
the middle of the interval from _______ to _______, the 5 in the
middle of the interval from _______ to _______, and the _______ in
the middle of the interval from _______ to _______.


Solution: 


* 3.5 to 4.5
* 4.5 to 5.5
* 6
* 5.5 to 6.5


Calculate the number of bars as follows:
Equation:

\frac{6.5 - 0.5}{\text{bars}} = 1

where 1 is the width of a bar. Therefore, bars = 6.
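
As a rough check of this setup, the sketch below rebuilds the 50 data values from the frequencies stated in the example and tallies one bar per value. It is an illustrative Python sketch only; the variable names are assumptions, not part of the original text.

```python
from collections import Counter

# Rebuild the data from the stated counts: eleven 1s, ten 2s, sixteen 3s, ...
books = [1] * 11 + [2] * 10 + [3] * 16 + [4] * 6 + [5] * 5 + [6] * 2

start, end, width = 0.5, 6.5, 1
bars = round((end - start) / width)  # (6.5 - 0.5) / 1

frequency = Counter(books)
for value in sorted(frequency):
    # Each data value sits in the middle of its bar.
    low, high = value - width / 2, value + width / 2
    print(f"{low} to {high}: frequency {frequency[value]}")
```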
The following histogram displays the number of books on the x-axis and
the frequency on the y-axis.

[Figure: histogram. x-axis: Number of Books; y-axis: Frequency.]


Using the TI-83, 83+, 84, 84+ Calculator Instructions 

Go to the Appendix (14:Appendix) in the menu on the left. There are 
calculator instructions for entering data and for creating a customized 
histogram. Create the histogram for Example 2. 


* Press Y=. Press CLEAR to clear out any equations.
* Press STAT 1:EDIT. If L1 has data in it, arrow up into the name L1,
press CLEAR and arrow down. If necessary, do the same for L2.
* Into L1, enter 1, 2, 3, 4, 5, 6
* Into L2, enter 11, 10, 16, 6, 5, 2
* Press WINDOW. Make Xmin = .5, Xmax = 6.5, Xscl = (6.5 - .5)/6,
Ymin = -1, Ymax = 20, Yscl = 1, Xres = 1
* Press 2nd Y=. Start by pressing 4:PlotsOff ENTER.
* Press 2nd Y=. Press 1:Plot1. Press ENTER. Arrow down to TYPE.
Arrow to the 3rd picture (histogram). Press ENTER.
* Arrow down to Xlist: Enter L1 (2nd 1). Arrow down to Freq. Enter L2
(2nd 2).
* Press GRAPH
* Use the TRACE key and the arrow keys to examine the histogram.


Optional Collaborative Exercise 


Count the money (bills and change) in your pocket or purse. Your instructor 
will record the amounts. As a class, construct a histogram displaying the 
data. Discuss how many intervals you think is appropriate. You may want to 
experiment with the number of intervals. Discuss, also, the shape of the 
histogram. 


Record the data, in dollars (for example, 1.25 dollars). 


Construct a histogram. 


Glossary 


Frequency 
The number of times a value of the data occurs. 


Relative Frequency 
The ratio of the number of times a value of the data occurs in the set of 
all outcomes to the number of all outcomes. 


Measures of the Center of the Data 
This chapter discusses measuring descriptive statistical information using the 
center of the data 


The "center" of a data set is also a way of describing location. The two most 
widely used measures of the "center" of the data are the mean (average) and the 
median. To calculate the mean weight of 50 people, add the 50 weights 
together and divide by 50. To find the median weight of the 50 people, order 
the data and find the number that splits the data into two equal parts (previously 
discussed under box plots in this chapter). The median is generally a better 
measure of the center when there are extreme values or outliers because it is not 
affected by the precise numerical values of the outliers. The mean is the most 
common measure of the center. 


Note: The words "mean" and "average" are often used interchangeably. The
substitution of one word for the other is common practice. The technical term
is "arithmetic mean" and "average" is technically a center location. However,
in practice among non-statisticians, "average" is commonly accepted for
"arithmetic mean."


The mean can also be calculated by multiplying each distinct value by its
frequency and then dividing the sum by the total number of data values. The
letter used to represent the sample mean is an x with a bar over it (pronounced
"x bar"): \bar{x}.


The Greek letter μ (pronounced "mew") represents the population mean. One
of the requirements for the sample mean to be a good estimate of the population
mean is for the sample taken to be truly random.


To see that both ways of calculating the mean are the same, consider the
sample:


1 1 1 2 2 3 4 4 4 4 4
Equation:

\bar{x} = \frac{1 + 1 + 1 + 2 + 2 + 3 + 4 + 4 + 4 + 4 + 4}{11} = 2.7

Equation:

\bar{x} = \frac{3 \times 1 + 2 \times 2 + 1 \times 3 + 5 \times 4}{11} = 2.7


In the second calculation for the sample mean, the frequencies are 3, 2, 1, and
5.
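
The equivalence of the two calculations can also be confirmed numerically. The following Python sketch is illustrative only; the variable names are assumptions.

```python
from collections import Counter

sample = [1, 1, 1, 2, 2, 3, 4, 4, 4, 4, 4]

# First way: add every value and divide by the number of values.
mean_direct = sum(sample) / len(sample)

# Second way: multiply each distinct value by its frequency, then divide.
freq = Counter(sample)  # maps value -> frequency
mean_weighted = sum(value * f for value, f in freq.items()) / sum(freq.values())

print(round(mean_direct, 1), round(mean_weighted, 1))
```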

You can quickly find the location of the median by using the expression \frac{n+1}{2}.
The letter n is the total number of data values in the sample. If n is an odd
number, the median is the middle value of the ordered data (ordered smallest to
largest). If n is an even number, the median is equal to the two middle values
added together and divided by 2 after the data has been ordered. For example, if
the total number of data values is 97, then \frac{n+1}{2} = \frac{97+1}{2} = 49. The median is the
49th value in the ordered data. If the total number of data values is 100, then
\frac{n+1}{2} = \frac{100+1}{2} = 50.5. The median occurs midway between the 50th and 51st
values. The location of the median and the value of the median are not the
same. The upper case letter M is often used to represent the median. The next
example illustrates the location of the median and the value of the median.
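
The distinction between the location and the value of the median can be sketched in code. This is a minimal Python illustration under the rules stated above; the function names `median_location` and `median_value` are assumptions made for this sketch.

```python
def median_location(n):
    """Position of the median in the ordered data, counting from 1."""
    return (n + 1) / 2


def median_value(data):
    """Value of the median: middle value, or mean of the two middle values."""
    ordered = sorted(data)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median_location(97))   # odd n: a whole-number position
print(median_location(100))  # even n: halfway between two positions
```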


Example: 
Exercise: 


Problem: 


AIDS data indicating the number of months an AIDS patient lives after 
taking a new antibody drug are as follows (smallest to largest): 


3 4 8 8 10 11 12 13 14 15 15 16 16 17 17 18 21 22 22 24 24 25 26 26 27 27 29 29 31 32 33
33 34 34 35 37 40 44 44 47


Calculate the mean and the median. 


Solution: 


The calculation for the mean is: 


\bar{x} = \frac{3 + 4 + (8)(2) + 10 + 11 + 12 + 13 + 14 + (15)(2) + (16)(2) + ... + 35 + 37 + 40 + (44)(2) + 47}{40}

\bar{x} = 23.6


To find the median, M, first use the formula for the location. The location 
is: 


\frac{n+1}{2} = \frac{40+1}{2} = 20.5


Starting at the smallest value, the median is located between the 20th and 
21st values (the two 24s): 


3 4 8 8 10 11 12 13 14 15 15 16 16 17 17 18 21 22 22 24 24
25 26 26 27 27 29 29 31 32 33 33 34 34 35 37 40 44 44 47


M = \frac{24 + 24}{2} = 24


The median is 24. 
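
For readers working in software rather than on a calculator, the same results can be reproduced with Python's standard `statistics` module. This is an illustrative sketch, not part of the original solution.

```python
import statistics

months = [3, 4, 8, 8, 10, 11, 12, 13, 14, 15, 15, 16, 16, 17, 17, 18,
          21, 22, 22, 24, 24, 25, 26, 26, 27, 27, 29, 29, 31, 32, 33,
          33, 34, 34, 35, 37, 40, 44, 44, 47]

n = len(months)
location = (n + 1) / 2               # location of the median
mean = statistics.mean(months)       # sample mean
median = statistics.median(months)   # value of the median

print(n, location, round(mean, 1), median)
```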


Using the TI-83,83+,84, 84+ Calculators 
Calculator Instructions are located in the menu item 14:Appendix (Notes for 
the TI-83, 83+, 84, 84+ Calculators). 


* Enter data into the list editor. Press STAT 1:EDIT
* Put the data values in list L1.
* Press STAT and arrow to CALC. Press 1:1-VarStats. Press 2nd 1 for L1
and ENTER.
* Press the down and up arrow keys to scroll.
* \bar{x} = 23.6, M = 24


Example: 
Exercise: 


Problem: 
Suppose that, in a small town of 50 people, one person earns $5,000,000
per year and the other 49 each earn $30,000. Which is the better measure
of the "center," the mean or the median?


Solution: 


\bar{x} = \frac{5000000 + 49 \times 30000}{50} = 129400


M = 30000


(There are 49 people who earn $30,000 and one person who earns 
$5,000,000.) 


The median is a better measure of the "center" than the mean because 49 
of the values are 30,000 and one is 5,000,000. The 5,000,000 is an outlier. 
The 30,000 gives us a better sense of the middle of the data. 
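
The effect of the single outlier is easy to see numerically. The sketch below is an illustrative Python check of this example; the variable names are assumptions.

```python
import statistics

# One person earns 5,000,000; the other 49 each earn 30,000.
incomes = [5_000_000] + [30_000] * 49

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

# The mean is pulled far above what 49 of the 50 people earn;
# the median is not affected by the size of the outlier.
print(mean_income, median_income)
```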


Another measure of the center is the mode. The mode is the most frequent 
value. If a data set has two values that occur the same number of times, then the 
set is bimodal. 


Example: 

Statistics exam scores for 20 students are as follows:
50 53 59 59 63 63 72 72 72 72 72 76 78 81 83 84 84 84 90 93
Exercise: 


Problem: Find the mode.
Solution: 


The most frequent score is 72, which occurs five times. Mode = 72. 


Example: 
Five real estate exam scores are 430, 430, 480, 480, 495. The data set is 
bimodal because the scores 430 and 480 each occur twice. 


When is the mode the best measure of the "center"? Consider a weight loss 
program that advertises a mean weight loss of six pounds the first week of the 
program. The mode might indicate that most people lose two pounds the first 
week, making the program less appealing. 


Note: The mode can be calculated for qualitative data as well as for
quantitative data.
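
A short Python sketch can illustrate both points: a data set may have more than one mode, and the mode is defined for qualitative data. The real estate scores come from the example above; the list of colors is hypothetical data invented for this illustration.

```python
import statistics

real_estate_scores = [430, 430, 480, 480, 495]
# multimode returns every value tied for the highest frequency.
modes = statistics.multimode(real_estate_scores)
print(modes)

# Hypothetical qualitative data: the mode still makes sense,
# even though a mean or median would not.
colors = ["red", "blue", "red", "green"]
color_mode = statistics.mode(colors)
print(color_mode)
```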


Statistical software will easily calculate the mean, the median, and the mode. 
Some graphing calculators can also make these calculations. In the real world, 
people make these calculations using software. 


The Law of Large Numbers and the Mean 


The Law of Large Numbers says that if you take samples of larger and larger
size from any population, then the mean \bar{x} of the sample is very likely to get
closer and closer to μ. This is discussed in more detail in The Central Limit
Theorem.
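
A small simulation can make this tendency concrete. The sketch below is an assumption-laden illustration rather than part of the original text: it treats rolls of a fair six-sided die as the population (so μ = 3.5) and uses a fixed random seed so the run is repeatable.

```python
import random

rng = random.Random(0)   # fixed seed, chosen arbitrarily for repeatability
population_mean = 3.5    # mean of a fair six-sided die (hypothetical population)

errors = {}
for n in (10, 1_000, 100_000):
    rolls = [rng.randint(1, 6) for _ in range(n)]
    sample_mean = sum(rolls) / n
    # Distance between the sample mean and the population mean.
    errors[n] = abs(sample_mean - population_mean)
    print(n, round(sample_mean, 3), round(errors[n], 3))
```

The distance is "very likely" to shrink as n grows, not guaranteed to shrink at every step, which matches the wording of the law.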


Note: The formula for the mean is located in the Summary of Formulas section.


Sampling Distributions and Statistic of a Sampling Distribution 


You can think of a sampling distribution as a relative frequency distribution 
with a great many samples. (See Sampling and Data for a review of relative 
frequency). Suppose thirty randomly selected students were asked the number 
of movies they watched the previous week. The results are in the relative 
frequency table shown below. 


# of movies    Relative Frequency
0              5/30
1              15/30
2              6/30
3              4/30
4              1/30


If you let the number of samples get very large (say, 300 million or more), 
the relative frequency table becomes a relative frequency distribution. 


A statistic is a number calculated from a sample. Statistic examples include the 
mean, the median and the mode as well as others. The sample mean \bar{x} is an
example of a statistic which estimates the population mean μ.


Glossary 


Mean 


A number that measures the central tendency. A common name for mean
is 'average.' The term 'mean' is a shortened form of 'arithmetic mean.' By
definition, the mean for a sample (denoted by \bar{x}) is
\bar{x} = \frac{\text{Sum of all values in the sample}}{\text{Number of values in the sample}},
and the mean for a population (denoted by μ) is
\mu = \frac{\text{Sum of all values in the population}}{\text{Number of values in the population}}.


Median 


A number that separates ordered data into halves. Half the values are the 
same number or smaller than the median and half the values are the same 
number or larger than the median. The median may or may not be part of 
the data. 


Mode 


The value that appears most frequently in a set of data. 


Skewness and the Mean, Median, and Mode 
Consider the following data set: 
4 5 6 6 6 7 7 7 7 7 7 8 8 8 9 10


This data set produces the histogram shown below. Each interval has width 
one and each value is located in the middle of an interval. 


[Figure: histogram of the data set; x-axis values 4 through 10.]


The histogram displays a symmetrical distribution of data. A distribution is 
symmetrical if a vertical line can be drawn at some point in the histogram 
such that the shape to the left and the right of the vertical line are mirror 
images of each other. The mean, the median, and the mode are each 7 for 
these data. In a perfectly symmetrical distribution, the mean and the 
median are the same. This example has one mode (unimodal) and the 
mode is the same as the mean and median. In a symmetrical distribution 
that has two modes (bimodal), the two modes would be different from the 
mean and median. 


The histogram for the data:
4 5 6 6 6 7 7 7 7 8


is not symmetrical. The right-hand side seems "chopped off" compared to
the left side. The shape of the distribution is called skewed to the left because it is
pulled out to the left.


[Figure: histogram of the data; x-axis values 4 through 8.]


The mean is 6.3, the median is 6.5, and the mode is 7. Notice that the 
mean is less than the median and they are both less than the mode. The 
mean and the median both reflect the skewing but the mean more so. 


The histogram for the data:
6 7 7 7 7 8 8 8 9 10


is also not symmetrical. It is skewed to the right.


[Figure: histogram of the data; x-axis values 6 through 10.]


The mean is 7.7, the median is 7.5, and the mode is 7. Of the three statistics, 
the mean is the largest, while the mode is the smallest. Again, the mean 
reflects the skewing the most. 


To summarize, generally if the distribution of data is skewed to the left, the 
mean is less than the median, which is often less than the mode. If the 
distribution of data is skewed to the right, the mode is often less than the 
median, which is less than the mean. 


Skewness and symmetry become important when we discuss probability 
distributions in later chapters. 
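
The three data sets in this section can be summarized side by side. The following Python sketch is illustrative only (the dictionary keys and variable names are assumptions); it recomputes the mean, median, and mode quoted above.

```python
import statistics

data_sets = {
    "symmetrical": [4, 5, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 8, 9, 10],
    "skewed left": [4, 5, 6, 6, 6, 7, 7, 7, 7, 8],
    "skewed right": [6, 7, 7, 7, 7, 8, 8, 8, 9, 10],
}

summary = {}
for name, data in data_sets.items():
    summary[name] = (
        statistics.mean(data),
        statistics.median(data),
        statistics.mode(data),
    )
    print(name, summary[name])  # (mean, median, mode)
```

In the left-skewed set the mean is the smallest of the three measures, and in the right-skewed set it is the largest, which is the pattern summarized above.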


Measures of the Spread of the Data 

Descriptive Statistics: Measuring the Spread of Data explains standard deviation as a measure of variation in data 
and is part of the collection col10555 written by Barbara Illowsky and Susan Dean. Roberta Bloom made 
contributions that helped to clarify the standard deviation and the variance. 


An important characteristic of any set of data is the variation in the data. In some data sets, the data values are 
concentrated closely near the mean; in other data sets, the data values are more widely spread out from the mean. 
The most common measure of variation, or spread, is the standard deviation. 


The standard deviation is a number that measures how far data values are from their mean. 
The standard deviation 


* provides a numerical measure of the overall amount of variation in a data set
* can be used to determine whether a particular data value is close to or far from the mean


The standard deviation provides a measure of the overall variation in a data set 

The standard deviation is always positive or 0. The standard deviation is small when the data are all concentrated 
close to the mean, exhibiting little variation or spread. The standard deviation is larger when the data values are 
more spread out from the mean, exhibiting more variation. 


Suppose that we are studying waiting times at the checkout line for customers at supermarket A and supermarket 
B; the average wait time at both markets is 5 minutes. At market A, the standard deviation for the waiting time is 2 
minutes; at market B the standard deviation for the waiting time is 4 minutes. 


Because market B has a higher standard deviation, we know that there is more variation in the waiting times at 
market B. Overall, wait times at market B are more spread out from the average; wait times at market A are more 
concentrated near the average. 


The standard deviation can be used to determine whether a data value is close to or far from the mean. 
Suppose that Rosa and Binh both shop at Market A. Rosa waits for 7 minutes and Binh waits for 1 minute at the
checkout counter. At market A, the mean wait time is 5 minutes and the standard deviation is 2 minutes.


Rosa waits for 7 minutes: 


* 7 is 2 minutes longer than the average of 5; 2 minutes is equal to one standard deviation.
* Rosa's wait time of 7 minutes is 2 minutes longer than the average of 5 minutes.
* Rosa's wait time of 7 minutes is one standard deviation above the average of 5 minutes.


Binh waits for 1 minute. 


* 1 is 4 minutes less than the average of 5; 4 minutes is equal to two standard deviations.
* Binh's wait time of 1 minute is 4 minutes less than the average of 5 minutes.
* Binh's wait time of 1 minute is two standard deviations below the average of 5 minutes.
* A data value that is two standard deviations from the average is just on the borderline for what many
statisticians would consider to be far from the average. Considering data to be far from the mean if it is more
than 2 standard deviations away is more of an approximate "rule of thumb" than a rigid rule. In general, the
shape of the distribution of the data affects how much of the data is further away than 2 standard deviations.
(We will learn more about this in later chapters.)


The number line may help you understand standard deviation. If we were to put 5 and 7 on a number line, 7 is to 
the right of 5. We say, then, that 7 is one standard deviation to the right of 5 because 
5 + (1)(2) =7. 


If 1 were also part of the data set, then 1 is two standard deviations to the left of 5 because 
5 + (-2)(2) =1. 


[Figure: number line marked from 0 to 7.]


* In general, a value = mean + (#ofSTDEVs)(standard deviation)
* where #ofSTDEVs = the number of standard deviations
* 7 is one standard deviation more than the mean of 5 because: 7 = 5 + (1)(2)
* 1 is two standard deviations less than the mean of 5 because: 1 = 5 + (-2)(2)


The equation value = mean + (#ofSTDEVs)(standard deviation) can be expressed for a sample and for a 
population: 


* Sample: x = \bar{x} + (#ofSTDEVs)(s)
* Population: x = μ + (#ofSTDEVs)(σ)


The lower case letter s represents the sample standard deviation and the Greek letter σ (sigma, lower case)
represents the population standard deviation.


The symbol \bar{x} is the sample mean and the Greek symbol μ is the population mean.
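
The relationship value = mean + (#ofSTDEVs)(standard deviation) can be used in either direction. The Python sketch below is an illustration using the Market A numbers from this section; the function names are assumptions made for the sketch.

```python
def value_from(mean, num_stdevs, stdev):
    """value = mean + (#ofSTDEVs)(standard deviation)"""
    return mean + num_stdevs * stdev


def num_stdevs_from(value, mean, stdev):
    """Solve the same equation for #ofSTDEVs."""
    return (value - mean) / stdev


mean_wait, stdev_wait = 5, 2  # Market A, in minutes

rosa = num_stdevs_from(7, mean_wait, stdev_wait)  # above the mean: positive
binh = num_stdevs_from(1, mean_wait, stdev_wait)  # below the mean: negative
print(rosa, binh)
print(value_from(mean_wait, 1, stdev_wait), value_from(mean_wait, -2, stdev_wait))
```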


Calculating the Standard Deviation 

If x is a number, then the difference "x - mean" is called its deviation. In a data set, there are as many deviations
as there are items in the data set. The deviations are used to calculate the standard deviation. If the numbers belong
to a population, in symbols a deviation is x - μ. For sample data, in symbols a deviation is x - \bar{x}.


The procedure to calculate the standard deviation depends on whether the numbers are the entire population or are 
data from a sample. The calculations are similar, but not identical. Therefore the symbol used to represent the 
standard deviation depends on whether it is calculated from a population or a sample. The lower case letter s
represents the sample standard deviation and the Greek letter σ (sigma, lower case) represents the population
standard deviation. If the sample has the same characteristics as the population, then s should be a good estimate
of σ.


To calculate the standard deviation, we need to calculate the variance first. The variance is an average of the
squares of the deviations (the x - \bar{x} values for a sample, or the x - μ values for a population). The symbol σ^2
represents the population variance; the population standard deviation σ is the square root of the population
variance. The symbol s^2 represents the sample variance; the sample standard deviation s is the square root of the
sample variance. You can think of the standard deviation as a special average of the deviations.


If the numbers come from a census of the entire population and not a sample, when we calculate the average of 
the squared deviations to find the variance, we divide by N, the number of items in the population. If the data are 
from a sample rather than a population, when we calculate the average of the squared deviations, we divide by
n - 1, one less than the number of items in the sample. You can see that in the formulas below.


Formulas for the Sample Standard Deviation

* s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} or s = \sqrt{\frac{\sum f \cdot (x - \bar{x})^2}{n - 1}}
* For the sample standard deviation, the denominator is n - 1, that is, the sample size MINUS 1.

Formulas for the Population Standard Deviation

* \sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} or \sigma = \sqrt{\frac{\sum f \cdot (x - \mu)^2}{N}}
* For the population standard deviation, the denominator is N, the number of items in the population.


In these formulas, f represents the frequency with which a value appears. For example, if a value appears once, f 
is 1. If a value appears three times in the data set or population, f is 3. 


Sampling Variability of a Statistic 

The statistic of a sampling distribution was discussed in Descriptive Statistics: Measuring the Center of the 
Data. How much the statistic varies from one sample to another is known as the sampling variability of a 
statistic. You typically measure the sampling variability of a statistic by its standard error. The standard error of 
the mean is an example of a standard error. It is a special standard deviation and is known as the standard 
deviation of the sampling distribution of the mean. You will cover the standard error of the mean in The Central
Limit Theorem (not now). The notation for the standard error of the mean is \frac{\sigma}{\sqrt{n}}, where σ is the standard
deviation of the population and n is the size of the sample.


Note: In practice, USE A CALCULATOR OR COMPUTER SOFTWARE TO CALCULATE THE 
STANDARD DEVIATION. If you are using a TI-83, 83+, 84+ calculator, you need to select the appropriate
standard deviation σ_x or s_x from the summary statistics. We will concentrate on using and interpreting the
information that the standard deviation gives us. However you should study the following step-by-step example to 
help you understand how the standard deviation measures variation from the mean. 


Example: 

In a fifth grade class, the teacher was interested in the average age and the sample standard deviation of the ages
of her students. The following data are the ages for a SAMPLE of n = 20 fifth grade students. The ages are
rounded to the nearest half year:

9 9.5 9.5 10 10 10 10 10.5 10.5 10.5 10.5 11 11 11 11 11 11 11.5 11.5 11.5

Equation:

\bar{x} = \frac{9 + 9.5 \times 2 + 10 \times 4 + 10.5 \times 4 + 11 \times 6 + 11.5 \times 3}{20} = 10.525


The average age is 10.53 years, rounded to 2 places. 
The variance may be calculated by using a table. Then the standard deviation is calculated by taking the square 
root of the variance. We will explain the parts of the table after calculating s. 


Data    Freq.   Deviations                 Deviations^2               (Freq.)(Deviations^2)
x       f       (x - \bar{x})              (x - \bar{x})^2            (f)(x - \bar{x})^2
9       1       9 - 10.525 = -1.525        (-1.525)^2 = 2.325625      1 x 2.325625 = 2.325625
9.5     2       9.5 - 10.525 = -1.025      (-1.025)^2 = 1.050625      2 x 1.050625 = 2.101250
10      4       10 - 10.525 = -0.525       (-0.525)^2 = 0.275625      4 x 0.275625 = 1.102500
10.5    4       10.5 - 10.525 = -0.025     (-0.025)^2 = 0.000625      4 x 0.000625 = 0.002500
11      6       11 - 10.525 = 0.475        (0.475)^2 = 0.225625       6 x 0.225625 = 1.353750
11.5    3       11.5 - 10.525 = 0.975      (0.975)^2 = 0.950625       3 x 0.950625 = 2.851875


The sample variance, s^2, is equal to the sum of the last column (9.7375) divided by the total number of data
values minus one (20 - 1):

s^2 = \frac{9.7375}{20 - 1} = 0.5125

The sample standard deviation s is equal to the square root of the sample variance:

s = \sqrt{0.5125} = 0.715891. Rounded to two decimal places, s = 0.72

Typically, you do the calculation for the standard deviation on your calculator or computer. The 
intermediate results are not rounded. This is done for accuracy. 
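
The table calculation can be mirrored in a few lines of code. The sketch below is an illustrative Python version (variable names are assumptions); it follows the frequency-table method above and then cross-checks the result against the standard library's `statistics.stdev`, which also divides by n - 1.

```python
import math
import statistics

# Age -> frequency, as in the table above.
ages = {9: 1, 9.5: 2, 10: 4, 10.5: 4, 11: 6, 11.5: 3}

n = sum(ages.values())
mean = sum(x * f for x, f in ages.items()) / n

# Sum of (frequency) x (squared deviation): the last column of the table.
sum_sq = sum(f * (x - mean) ** 2 for x, f in ages.items())

sample_variance = sum_sq / (n - 1)  # divide by n - 1 because the data are a sample
s = math.sqrt(sample_variance)

# Cross-check using the 20 raw values.
raw = [x for x, f in ages.items() for _ in range(f)]
print(mean, round(sum_sq, 4), round(sample_variance, 4), round(s, 6))
print(round(statistics.stdev(raw), 6))
```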

Exercise: 


Problem: Verify the mean and standard deviation calculated above on your calculator or computer. 


Solution: 
Using the TI-83,83+,84+ Calculators 


* Enter data into the list editor. Press STAT 1:EDIT. If necessary, clear the lists by arrowing up into the
name. Press CLEAR and arrow down.
* Put the data values (9, 9.5, 10, 10.5, 11, 11.5) into list L1 and the frequencies (1, 2, 4, 4, 6, 3) into list
L2. Use the arrow keys to move around.
* Press STAT and arrow to CALC. Press 1:1-VarStats and enter L1 (2nd 1), L2 (2nd 2). Do not forget the
comma. Press ENTER.
* \bar{x} = 10.525
* Use Sx because this is sample data (not a population): Sx = 0.715891


For the following problems, recall that value = mean + (#ofSTDEVs)(standard deviation)
For a sample: x = \bar{x} + (#ofSTDEVs)(s)

For a population: x = μ + (#ofSTDEVs)(σ)

For this example, use x = \bar{x} + (#ofSTDEVs)(s) because the data is from a sample


Exercise:
Problem: Find the value that is 1 standard deviation above the mean. Find (\bar{x} + 1s).


Solution:


(\bar{x} + 1s) = 10.53 + (1)(0.72) = 11.25


Exercise:
Problem: Find the value that is two standard deviations below the mean. Find (\bar{x} - 2s).


Solution:


(\bar{x} - 2s) = 10.53 - (2)(0.72) = 9.09


Exercise:


Problem: Find the values that are 1.5 standard deviations from (below and above) the mean.


Solution:


* (\bar{x} - 1.5s) = 10.53 - (1.5)(0.72) = 9.45
* (\bar{x} + 1.5s) = 10.53 + (1.5)(0.72) = 11.61


Explanation of the standard deviation calculation shown in the table 

The deviations show how spread out the data are about the mean. The data value 11.5 is farther from the mean than
is the data value 11. The deviations 0.975 and 0.475 indicate that. A positive deviation occurs when the data value is
greater than the mean. A negative deviation occurs when the data value is less than the mean; the deviation is 
-1.525 for the data value 9. If you add the deviations, the sum is always zero. (For this example, there are n=20 
deviations.) So you cannot simply add the deviations to get the spread of the data. By squaring the deviations, you 
make them positive numbers, and the sum will also be positive. The variance, then, is the average squared 
deviation. 


The variance is a squared measure and does not have the same units as the data. Taking the square root solves the 
problem. The standard deviation measures the spread in the same units as the data. 


Notice that instead of dividing by n=20, the calculation divided by n-1=20-1=19 because the data is a sample. For 
the sample variance, we divide by the sample size minus one (n-1). Why not divide by n? The answer has to do 
with the population variance. The sample variance is an estimate of the population variance. Based on the 
theoretical mathematics that lies behind these calculations, dividing by (n-1) gives a better estimate of the 
population variance. 


Note: Your concentration should be on what the standard deviation tells us about the data. The standard deviation 
is a number which measures how far the data are spread from the mean. Let a calculator or computer do the 
arithmetic. 


The standard deviation, s or σ, is either zero or larger than zero. When the standard deviation is 0, there is no
spread; that is, all the data values are equal to each other. The standard deviation is small when the data are all
concentrated close to the mean, and is larger when the data values show more variation from the mean. When the
standard deviation is a lot larger than zero, the data values are very spread out about the mean; outliers can make s
or σ very large.


The standard deviation, when first presented, can seem unclear. By graphing your data, you can get a better "feel" 
for the deviations and the standard deviation. You will find that in symmetrical distributions, the standard deviation 
can be very helpful but in skewed distributions, the standard deviation may not be much help. The reason is that 
the two sides of a skewed distribution have different spreads. In a skewed distribution, it is better to look at the 
first quartile, the median, the third quartile, the smallest value, and the largest value. Because numbers can be 
confusing, always graph your data. 


Note: The formula for the standard deviation is at the end of the chapter.


Example: 
Exercise: 


Problem: Use the following data (first exam scores) from Susan Dean's spring pre-calculus class: 


33 42 49 49 53 55 55 61 63 67 68 68 69 69 72 73 74 78 80 83 88 88 88 90 92 94 94 94 94 96 100


* a. Create a chart containing the data, frequencies, relative frequencies, and cumulative relative
frequencies to three decimal places.
* b. Calculate the following to one decimal place using a TI-83+ or TI-84 calculator:
    i. The sample mean
    ii. The sample standard deviation
    iii. The median
    iv. The first quartile
    v. The third quartile
    vi. IQR
* c. Construct a box plot and a histogram on the same set of axes. Make comments about the box plot, the
histogram, and the chart.


Solution: 
* a.

Data    Frequency    Relative Frequency    Cumulative Relative Frequency
33      1            0.032                 0.032
42      1            0.032                 0.064
49      2            0.065                 0.129
53      1            0.032                 0.161
55      2            0.065                 0.226
61      1            0.032                 0.258
63      1            0.032                 0.290
67      1            0.032                 0.322
68      2            0.065                 0.387
69      2            0.065                 0.452
72      1            0.032                 0.484
73      1            0.032                 0.516
74      1            0.032                 0.548
78      1            0.032                 0.580
80      1            0.032                 0.612
83      1            0.032                 0.644
88      3            0.097                 0.741
90      1            0.032                 0.773
92      1            0.032                 0.805
94      4            0.129                 0.934
96      1            0.032                 0.966
100     1            0.032                 0.998 (Why isn't this value 1?)
* b.
    i. The sample mean = 73.5
    ii. The sample standard deviation = 17.9
    iii. The median = 73
    iv. The first quartile = 61
    v. The third quartile = 90
    vi. IQR = 90 - 61 = 29
* c. The x-axis goes from 32.5 to 100.5; y-axis goes from -2.4 to 15 for the histogram; number of intervals
is 5 for the histogram so the width of an interval is (100.5 - 32.5) divided by 5 which is equal to 13.6.
Endpoints of the intervals: starting point is 32.5, 32.5+13.6 = 46.1, 46.1+13.6 = 59.7, 59.7+13.6 = 73.3,
73.3+13.6 = 86.9, 86.9+13.6 = 100.5 = the ending value; No data values fall on an interval boundary.


The long left whisker in the box plot is reflected in the left side of the histogram. The spread of the exam scores in 
the lower 50% is greater (73 - 33 = 40) than the spread in the upper 50% (100 - 73 = 27). The histogram, box plot, 
and chart all reflect this. There are a substantial number of A and B grades (80s, 90s, and 100). The histogram 
clearly shows this. The box plot shows us that the middle 50% of the exam scores (IQR = 29) are Ds, Cs, and Bs. 
The box plot also shows us that the lower 25% of the exam scores are Ds and Fs. 
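
The summary statistics in part b can also be reproduced in software. The Python sketch below is illustrative only. It assumes the `method="exclusive"` option of `statistics.quantiles`, which for these 31 values lands on the same quartiles as the solution; other quartile conventions can give slightly different values.

```python
import statistics

scores = [33, 42, 49, 49, 53, 55, 55, 61, 63, 67, 68, 68, 69, 69, 72, 73,
          74, 78, 80, 83, 88, 88, 88, 90, 92, 94, 94, 94, 94, 96, 100]

mean = statistics.mean(scores)
s = statistics.stdev(scores)  # sample standard deviation (divides by n - 1)

# Three cut points that split the ordered data into quarters.
q1, median, q3 = statistics.quantiles(scores, n=4, method="exclusive")
iqr = q3 - q1

print(round(mean, 1), round(s, 1), median, q1, q3, iqr)
```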


Comparing Values from Different Data Sets 
The standard deviation is useful when comparing data values that come from different data sets. If the data sets 
have different means and standard deviations, it can be misleading to compare the data values directly. 


* For each data value, calculate how many standard deviations the value is away from its mean.
* Use the formula: value = mean + (#ofSTDEVs)(standard deviation); solve for #ofSTDEVs.
* \text{\#ofSTDEVs} = \frac{\text{value} - \text{mean}}{\text{standard deviation}}
* Compare the results of this calculation.


#ofSTDEVs is often called a "z-score"; we can use the symbol z. In symbols, the formulas become: 


Sample:      x = \bar{x} + zs     z = \frac{x - \bar{x}}{s}
Population:  x = \mu + z\sigma    z = \frac{x - \mu}{\sigma}
Example: 
Exercise: 
Problem: 


Two students, John and Ali, from different high schools, wanted to find out who had the highest G.P.A. when 
compared to his school. Which student had the highest G.P.A. when compared to his school? 


Student GPA School Mean GPA School Standard Deviation 
John 2.85 3.0 0.7 
Ali 77 80 10 

Solution: 


For each student, determine how many standard deviations (#ofSTDEVs) his GPA is away from the average, 
for his school. Pay careful attention to signs when comparing and interpreting the answer. 


\text{\#ofSTDEVs} = \frac{\text{value} - \text{mean}}{\text{standard deviation}}; z = \frac{x - \mu}{\sigma}

For John, z = #ofSTDEVs = \frac{2.85 - 3.0}{0.7} = -0.21
For Ali, z = #ofSTDEVs = \frac{77 - 80}{10} = -0.3


John has the better G.P.A. when compared to his school because his G.P.A. is 0.21 standard deviations below 
his school's mean while Ali's G.P.A. is 0.3 standard deviations below his school's mean. 


John's z-score of —0.21 is higher than Ali's z-score of —0.3 . For GPA, higher values are better, so we 
conclude that John has the better GPA when compared to his school. 
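
The comparison can be sketched in code as well. The Python snippet below is illustrative; the function name `z_score` is an assumption, and the numbers come from the table in the problem.

```python
def z_score(value, mean, stdev):
    """Number of standard deviations a value lies from its mean."""
    return (value - mean) / stdev


john = z_score(2.85, 3.0, 0.7)
ali = z_score(77, 80, 10)

print(round(john, 2), round(ali, 2))
# Both z-scores are negative; the one closer to zero is the higher value.
print("John" if john > ali else "Ali")
```

Working with z-scores puts the two GPAs on a common scale even though the schools grade on different scales.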


The following lists give a few facts that provide a little more insight into what the standard deviation tells us about 
the distribution of the data. 
For ANY data set, no matter what the distribution of the data is: 


* At least 75% of the data is within 2 standard deviations of the mean.
* At least 89% of the data is within 3 standard deviations of the mean.
* At least 95% of the data is within 4.5 standard deviations of the mean.
* This is known as Chebyshev's Rule.


For data having a distribution that is MOUND-SHAPED and SYMMETRIC: 


* Approximately 68% of the data is within 1 standard deviation of the mean.
* Approximately 95% of the data is within 2 standard deviations of the mean.
* More than 99% of the data is within 3 standard deviations of the mean.
* This is known as the Empirical Rule.
* It is important to note that this rule only applies when the shape of the distribution of the data is mound-shaped
and symmetric. We will learn more about this when studying the "Normal" or "Gaussian" probability
distribution in later chapters.
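
The three Chebyshev percentages listed above all come from one expression, 1 - 1/k^2, where k is the number of standard deviations (k > 1). That general form is not stated in the text above; it is supplied here as the standard statement of Chebyshev's Rule. A minimal Python sketch, with the function name as an assumption:

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of any data set within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2


for k in (2, 3, 4.5):
    print(k, round(chebyshev_lower_bound(k), 3))
```

These are guaranteed minimums for any distribution, which is why they are lower than the approximate percentages of the Empirical Rule for mound-shaped, symmetric data.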


**With contributions from Roberta Bloom 


Glossary 


Standard Deviation 
A number that is equal to the square root of the variance and measures how far data values are from their
mean. Notation: s for sample standard deviation and σ for population standard deviation.


Variance 
Mean of the squared deviations from the mean. Square of the standard deviation. For a set of data, a deviation
can be represented as x - \bar{x} where x is a value of the data and \bar{x} is the sample mean. The sample variance is
equal to the sum of the squares of the deviations divided by the difference of the sample size and 1.


Summary of Formulas 
A summary of useful formulas used in examining descriptive statistics 
Commonly Used Symbols 


• The symbol $\Sigma$ means to add or to find the sum.

• n = the number of data values in a sample

• N = the number of people, things, etc. in the population
• $\bar{x}$ = the sample mean

• s = the sample standard deviation

• $\mu$ = the population mean

• $\sigma$ = the population standard deviation

• f = frequency

• x = numerical value


Commonly Used Expressions 


• x*f = A value multiplied by its respective frequency

• $\sum x$ = The sum of the values

• $\sum x*f$ = The sum of values multiplied by their respective frequencies

• $(x - \bar{x})$ or $(x - \mu)$ = Deviations from the mean (how far a value is
from the mean)

• $(x - \bar{x})^2$ or $(x - \mu)^2$ = Deviations squared

• $f(x - \bar{x})^2$ or $f(x - \mu)^2$ = The deviations squared and multiplied by
their frequencies


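Mean Formulas:


• $\mu = \frac{\sum x}{N}$ or $\mu = \frac{\sum f \cdot x}{N}$

• $\bar{x} = \frac{\sum x}{n}$ or $\bar{x} = \frac{\sum f \cdot x}{n}$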


Formulas Relating a Value, the Mean, and the Standard Deviation: 


• value = mean + (#ofSTDEVs)(standard deviation), where #ofSTDEVs
= the number of standard deviations

• $x = \bar{x} + (\#ofSTDEVs)(s)$

• $x = \mu + (\#ofSTDEVs)(\sigma)$


Practice 1: Center of the Data 

This module provides students with opportunities to apply concepts related 
to descriptive statistics. Students are asked to take a set of sample data and 
calculate a series of statistical values for that data. 


Student Learning Outcomes 


• The student will calculate and interpret the center, spread, and location
of the data.
• The student will construct and interpret histograms and box plots.


Given 


Sixty-five randomly selected car salespersons were asked the number of 
cars they generally sell in one week. Fourteen people answered that they 
generally sell three cars; nineteen generally sell four cars; twelve generally 
sell five cars; nine generally sell six cars; eleven generally sell seven cars. 


Complete the Table 


Data Value (# cars)    Frequency    Relative Frequency    Cumulative Relative Frequency


Discussion Questions 


Exercise: 


Problem: What does the frequency column sum to? Why? 


Solution: 
65 


Exercise: 


Problem: What does the relative frequency column sum to? Why? 


Solution: 


1 
Exercise: 


Problem: 


What is the difference between relative frequency and frequency for 
each data value? 


Exercise: 


Problem: 


What is the difference between cumulative relative frequency and 
relative frequency for each data value? 


Enter the Data 


Enter your data into your calculator or computer. 
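
If a computer is used instead of a calculator, the data can be entered as value/frequency pairs. The sketch below is one possible way to do this in Python; the variable names are illustrative, and the relative frequencies are rounded to four decimal places as an assumption about the desired precision.

```python
# Frequencies from the "Given" paragraph: number of cars sold -> number of salespersons
frequencies = {3: 14, 4: 19, 5: 12, 6: 9, 7: 11}

n = sum(frequencies.values())  # total number of salespersons surveyed

# Expand to one entry per salesperson, which is what most statistics functions expect
data = [value for value, count in frequencies.items() for _ in range(count)]

# Build the rows: (data value, frequency, relative frequency, cumulative relative frequency)
table = []
cumulative = 0.0
for value, count in sorted(frequencies.items()):
    relative = count / n
    cumulative += relative
    table.append((value, count, round(relative, 4), round(cumulative, 4)))

for row in table:
    print(row)
```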


Construct a Histogram 


Determine appropriate minimum and maximum x and y values and the 
scaling. Sketch the histogram below. Label the horizontal and vertical axes 
with words. Include numerical scaling. 


Data Statistics 


Calculate the following values: 
Exercise: 


Problem: Sample mean = $\bar{x}$ =


Solution: 


4.75 


Exercise: 


Problem: Sample standard deviation = $s_x$ =


Solution: 


1.39 


Exercise: 


Problem: Sample size = n = 


Solution: 


65 


Calculations 


Use the table in section 2.11.3 to calculate the following values: 
Exercise: 


Problem: Median = 


Solution: 
4 
Exercise: 


Problem: Mode = 


Solution: 
4 
Exercise: 


Problem: First quartile = 


Solution: 
4 
Exercise: 


Problem: Second quartile = median = 50th percentile = 


Solution: 


4 


Exercise: 


Problem: Third quartile = 


Solution: 
6 


Exercise: 


Problem: Interquartile range (IQR) =
Solution:
6 - 4 = 2
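
The median and quartiles above can be checked with a short sketch. It assumes the "median of each half" convention for quartiles (for an odd sample size the middle value is left out of both halves); software packages use several other conventions, so results can differ slightly on other data. The function name is illustrative.

```python
import statistics

# One entry per salesperson, built from the frequencies in the "Given" paragraph
frequencies = {3: 14, 4: 19, 5: 12, 6: 9, 7: 11}
data = [value for value, count in frequencies.items() for _ in range(count)]


def quartiles_by_halves(values):
    """Return (Q1, median, Q3) using the median of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2       # size of each half; the middle value is skipped when n is odd
    lower = ordered[:half]
    upper = ordered[-half:]
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))


q1, median, q3 = quartiles_by_halves(data)
iqr = q3 - q1
print(q1, median, q3, iqr)
```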

Exercise: 


Problem: 10th percentile = 


Solution: 
3 
Exercise: 


Problem: 70th percentile = 

Solution: 

6 

Exercise: 

Problem: Find the value that is 3 standard deviations:
a. Above the mean
b. Below the mean

Solution:


a. 8.93


b. 0.58
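
These two answers follow from value = mean + (#ofSTDEVs)(standard deviation). The sketch below recomputes them from the unrounded sample mean and sample standard deviation; using the rounded values 4.75 and 1.39 instead gives 8.92 rather than 8.93 for part a. Variable names are illustrative.

```python
import statistics

frequencies = {3: 14, 4: 19, 5: 12, 6: 9, 7: 11}
data = [value for value, count in frequencies.items() for _ in range(count)]

sample_mean = statistics.mean(data)   # about 4.75
sample_sd = statistics.stdev(data)    # sample standard deviation (divides by n - 1), about 1.39

three_sd_above = sample_mean + 3 * sample_sd
three_sd_below = sample_mean - 3 * sample_sd

print(round(sample_mean, 2), round(sample_sd, 2))
print(round(three_sd_above, 2), round(three_sd_below, 2))
```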


Box Plot 


Construct a box plot below. Use a ruler to measure and scale accurately. 


Interpretation 


Looking at your box plot, does it appear that the data are concentrated 
together, spread out evenly, or concentrated in some areas, but not in 
others? How can you tell? 


Practice 2: Spread of the Data 
Practice exercise for Descriptive Statistics 


Student Learning Outcomes 


• The student will calculate measures of the center of the data.
• The student will calculate the spread of the data.


Given 


The population parameters below describe the full-time equivalent number of 
students (FTES) each year at Lake Tahoe Community College from 1976-77 
through 2004-2005. (Source: Graphically Speaking by Bill King, LTCC 
Institutional Research, December 2005). 


Use these values to answer the following questions: 


• $\mu$ = 1000 FTES

• Median = 1014 FTES

• $\sigma$ = 474 FTES

• First quartile = 528.5 FTES

• Third quartile = 1447.5 FTES
• n = 29 years


Calculate the Values 


Exercise: 


Problem: 


A sample of 11 years is taken. About how many are expected to have a 
FTES of 1014 or above? Explain how you determined your answer. 


Solution: 


6 


Exercise: 


Problem: 75% of all years have a FTES:
a. At or below:
b. At or above:

Solution:


a. 1447.5
b. 528.5


Exercise: 


Problem: The population standard deviation = 


Solution: 


474 FTES 
Exercise: 


Problem: 


What percent of the FTES were from 528.5 to 1447.5? How do you 
know? 


Solution: 
50% 
Exercise: 
Problem: What is the IQR? What does the IQR represent? 
Solution: 


1447.5 - 528.5 = 919


Exercise: 


Problem: 
How many standard deviations away from the mean is the median? 
Solution: 


0.03 


Additional Information: The population FTES for 2005-2006 through 2010- 
2011 was given in an updated report. (Source: 
http://www.ltcc.edu/data/ResourcePDF/LTCC_FactBook_2010-11.pdf). The 
data are reported here. 


Year          2005-06    2006-07    2007-08    2008-09    2009-10    2010-11
Total FTES    1585       1690       1735       1935       2021       1890

Exercise: 
Problem: 


Calculate the mean, median, standard deviation, first quartile, the third 
quartile and the IQR. Round to one decimal place. 


Solution: 


mean = 1809.3 

median = 1812.5 

standard deviation = 151.2 
First quartile = 1690 


Third quartile = 1935 
IQR = 245
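
The solution values can be verified with Python's standard `statistics` module. This sketch assumes the six years are treated as a population (so `pstdev` is used, which matches the stated 151.2) and that quartiles are the medians of the lower and upper halves.

```python
import statistics

# Total FTES for 2005-06 through 2010-11, from the table above
ftes = [1585, 1690, 1735, 1935, 2021, 1890]

mean = statistics.mean(ftes)
median = statistics.median(ftes)
population_sd = statistics.pstdev(ftes)   # population standard deviation (divides by N)

ordered = sorted(ftes)
first_quartile = statistics.median(ordered[:3])   # median of the lower half
third_quartile = statistics.median(ordered[3:])   # median of the upper half
iqr = third_quartile - first_quartile

print(round(mean, 1), median, round(population_sd, 1))
print(first_quartile, third_quartile, iqr)
```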


Exercise: 
Problem: 
Construct a boxplot for the FTES for 2005-2006 through 2010-2011 and 
a boxplot for the FTES for 1976-1977 through 2004-2005. 
Exercise: 
Problem: 
Compare the IQR for the FTES for 1976-77 through 2004-2005 with the 


IQR for the FTES for 2005-2006 through 2010-2011. Why do you 
suppose the IQRs are so different? 


Solution: 


Hint: Think about the number of years covered by each time period and 
what happened to higher education during those periods. 


Sets 


Set theory is the study of collections of objects. A collection may comprise
anything, concrete or abstract. Its members can be purely abstract things like
numbers, or abstractions of real things like the students studying in class XI in
a school. The members of a collection can be numbers, letters, titles of books,
people, teachers, provinces, virtually anything, even other collections. Further,
a collection need not be finite. For example, the set of integers has infinitely
many members. The only requirement for a set is that the members of the
collection are properly defined.


Set 
A set is a collection of well defined objects. 


In other words, each member of a set is clearly identifiable. The terms “object”,
“member” and “element” mean the same thing and are used interchangeably.


How to represent a set? 


A set is denoted by a capital letter like “A”, “B”, “C” etc. In choosing a symbol
for a set, it is generally convenient to use a capital letter that identifies with the
set. For example, it is appropriate to use the symbol “V” to represent the
collection of vowels in the English alphabet.


On the other hand, the members or elements of a set are denoted by small letters
like “a”, “b”, “c” etc.


Membership of a set is denoted by the symbol “$\in$”. Its literal meaning is
“belongs to”. If an object does not belong to a set, then we convey this
using the symbol “$\notin$”.


$a \in A$: we read this as “a” belongs to set "A".
$a \notin A$: we read this as “a” does not belong to set "A".
A set is represented in two ways :


• Roster form
• Set builder form


Roster form




All elements of the set are listed with a comma (“,”) in between and the listing 
itself is enclosed within braces “{“ and “}”. The order or sequence of elements 
within the set is not important — though desirable. 


The set of numbers, which divide 12, is written as :
A = {1,2,3,4,6,12}


If a pattern or sequence is easily made out, then we can use ellipsis ("...") to 
represent continuity of such pattern. This type of representation is particularly 
useful to represent an infinite set. Clearly, sequence of members in this type of 
representation is important. 


The set of even numbers is written as,
B = {2,4,6,8,...}


The roster form is limited in certain circumstances. For example, we cannot
represent the set of real numbers in roster form. The real numbers form an
infinite set, but the elements of this set cannot be listed in a pattern or
particular sequence. As such, we cannot define the set with the help of an ellipsis.


Every member of a set is unique and distinct. However, we encounter
situations in which a collection has repeated elements. For example, the
collection representing the scores of three students can have three numbers,
one of which is repeated :


S = {80,80,70}
We need to reduce such a collection to its distinct elements :


$\Rightarrow$ S = {80,80,70} = {80,70}


Set builder form 


Collections are often characterized by a common property. We can, therefore,
define members of a set in terms of the common property. However, we need to
ensure that objects outside the collection do not have the same common
property.


The construction of qualification for the common property is quite flexible. 
Only thing is that we need to be explicit in what we mean. Generally, we denote 
the member by a symbol like “x” and then define the membership. Consider the 
examples : 


A = {x: x is a vowel in English alphabet}
B = {x: x is an integer and 0 < x < 10}
The roster equivalents of the two sets are :
A = {a,e,i,o,u}
B = {1,2,3,4,5,6,7,8,9}


Can we write set “B” as the one which comprises single digit natural number? 
Yes. Thus, we can see that there are indeed different ways to define and identify 
members and hence the flexibility in defining collection. 
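
Set builder form has a close analogue in Python's set comprehensions. The sketch below is only an illustration: because a program cannot scan all integers, it assumes a finite search range (here -100 to 100) that is wide enough to contain every member of "B".

```python
import string

# A = {x : x is a vowel in English alphabet}
A = {x for x in string.ascii_lowercase if x in "aeiou"}

# B = {x : x is an integer and 0 < x < 10}, searched over an assumed finite range
B = {x for x in range(-100, 101) if 0 < x < 10}

# The same set "B" defined differently: single digit natural numbers
B_alternative = {x for x in range(1, 101) if len(str(x)) == 1}

print(sorted(A))
print(sorted(B))
print(B == B_alternative)
```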


We should be careful in using words like “and” and “or” in writing qualification 
for the set. Consider the example here : 


$C = \{x : x \in Z \text{ and } 2 < x < 4\}$


Both conditional qualifications are used to determine the collection. The
elements of the set as defined above are integers. Thus, the only member of the
set is 3.


Now, let us consider an example, which involves “or” in the qualification, 


$C = \{x : x \in A \text{ or } x \in B\}$


The member of this set can be elements belonging to either of two sets "A" and 
"B". The set consists of elements (i) belonging exclusively to set "A", (ii) 
elements belonging exclusively to set "B" and (iii) elements common to sets 
"A" and "B". 


Example 


Problem 1 : A set in roster form is given as :


$A = \left\{ \frac{5^2}{6}, \frac{6^2}{7}, \frac{7^2}{8} \right\}$


Write the set in “set builder form”. 


Solution : We see here that we are dealing with natural numbers. The
numerators are squares of natural numbers in sequence. The denominator of
each member is one more than the natural number that is squared in its
numerator. We can denote the natural number by “n”. Clearly, if the numerator
is “$n^2$”, then the denominator is “n+1”. Therefore, the expression that
represents a member of the set is :


$\frac{n^2}{n+1}$


However, this set is not an infinite set. It has exactly three members. Therefore, 
we need to specify “n” so that only members of the set are exclusively denoted 
by the above expression. We see here that “n” is greater than 4, but “n” is less 
than 8. For representing three elements of the set, 


$5 \le n \le 7$


We can write the set, now, in the builder form as :


$A = \left\{ x : x = \frac{n^2}{n+1}, \text{ where } n \text{ is a natural number and } 5 \le n \le 7 \right\}$


In set builder form, the sequence within the range is implied. It means that we 
start with the first valid natural number and proceed sequentially till the last 
valid natural number. 
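
The builder form of this example can be checked against the roster form with exact fractions. This is a minimal sketch using Python's standard `fractions` module; the variable names are illustrative.

```python
from fractions import Fraction

# Builder form: x = n^2 / (n + 1) for natural numbers n with 5 <= n <= 7
builder_form = {Fraction(n**2, n + 1) for n in range(5, 8)}

# Roster form: the three members written out explicitly
roster_form = {Fraction(25, 6), Fraction(36, 7), Fraction(49, 8)}

print(builder_form == roster_form)
```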


Some important sets representing numbers 


A few key number sets are used regularly in mathematical contexts. As we use
these sets often, it is convenient to have predefined symbols :


• P (prime numbers)
• N (natural numbers)
• Z (integers)
• Q (rational numbers)
• R (real numbers)


We put a superscript “+” if we want to specify membership of only positive
numbers, where appropriate. “$Z^+$”, for example, means the set of positive
integers.


Empty set 
An empty set has no member or object. It is denoted by the symbol “$\phi$” and is
represented by a pair of braces without any member or object.


$\phi = \{\}$


The empty set is also called the “null” or “void” set. For example, consider a
definition : “the set of integers between 1 and 2”. There is no integer within this
range. Hence, the set corresponding to this definition is an empty set. Consider
another example :


$B = \{x : x^2 = 4 \text{ and } x \text{ is odd}\}$


The square of an odd integer cannot be even. Hence, set “B” also does not have
any element in it.


There is a bit of paradox here. If the definition does not yield an element, then 
the set is not well defined. We may be tempted to say that empty set is not a set 
in the first place. However, there is a practical reason to have an empty set. It 
enables mathematical operations. We shall find many examples as we study 
operations on sets. 


Equal sets 


The members of two equal sets are exactly the same. There is nothing more to it.
However, we need to know two special aspects of this equality. We mentioned
the repetition of elements in a set and observed that repetition of elements
does not change the set. Consider the example here :


A = {1,5,5,8,7} = {1,5,8,7}
Another point is that sequence does not change the set. Therefore,
A = {1,5,8,7} = {5,7,8,1}


In a nutshell, when we have to compare two sets we look for distinct elements
only. If they are the same, then the two sets in question are equal.
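
Python's built-in `set` type follows the same two rules, so it can serve as a quick, informal check. The variable names are illustrative.

```python
with_repetition = {1, 5, 5, 8, 7}      # the repeated 5 is stored only once
without_repetition = {1, 5, 8, 7}
different_order = {5, 7, 8, 1}

print(with_repetition == without_repetition)   # repetition does not change the set
print(without_repetition == different_order)   # sequence does not change the set
print(len(with_repetition))                    # number of distinct elements
```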


Cardinality 


Cardinality is the number of elements in a set. It is denoted by the modulus
sign, like |A|.


Cardinality
The cardinality of a set “A” is equal to the number of elements in the set.


The cardinality of an empty set is zero. The cardinality of a non-empty finite
set is a positive integer. The cardinality of a number system like the integers is
infinite. Curiously, the cardinalities of some infinite sets can be compared. For
example, the natural numbers and the integers have the same cardinality, whereas
the cardinality of the natural numbers is less than that of the real numbers.
However, we cannot make such deductions for most infinite sets.


Subsets 


The collections are generally linked in a given context. If we think of 
ourselves, then we belong to a certain society, which in turn belongs to a 
province, which in turn belongs to a country and so on. In the context of a 
school, all students of a school belong to school. Some of them belong to a 
certain class. If there are sections within a class, then some of these belong 
to a section. 


We need to have a mathematical relationship between different collections 
of similar types. In set theory, we denote this relationship with the concept 
of “subset”. 


Subset 
A set, “A” is a subset of set “B”, if each member of set “A” is also a 
member of set “B”. 


We use the symbol “$\subset$” to denote this relationship between a “subset” and a
“set”. Hence,


$A \subset B$


We read this symbolic representation as : set “A” is a subset of set “B”. We
express the intent of the relationship as :


$A \subset B$ if $x \in A$, then $x \in B$


It is evident that set "B" is at least as large as set "A". This is sometimes
emphasized by calling set "B" the “superset” of "A". We use the symbol
“$\supset$” to denote this relation :


$B \supset A$
If set "A" is not a subset of "B", then we write this symbolically as :


$A \not\subset B$
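
The definition translates directly into a membership test. The sketch below writes that test out by hand (the function name `is_subset` is illustrative) and compares it with Python's built-in subset and superset operators, using two small example sets chosen for illustration.

```python
def is_subset(small: set, big: set) -> bool:
    """True if each member of `small` is also a member of `big`."""
    return all(x in big for x in small)


A = {4, 5, 6}
B = {1, 2, 3, 4, 5, 6}

print(is_subset(A, B))      # A is a subset of B
print(is_subset(B, A))      # B is not a subset of A
print(A <= B, B >= A)       # built-in subset and superset operators
print(is_subset(set(), A))  # the empty set passes the test for any set
```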


Important results / deductions 


Some of the important characteristics and related deductions are presented 
here : 


Equality of two sets 


It is clear that set “B” is inclusive of subset “A”. It means that “B” may
have additional elements over and above those common with “A”. In case
all elements of “B” are also in “A”, the two sets are equal. We express this
symbolically as :


If $A \subset B$ and $B \subset A$, then $A = B$.


This is true in the other direction as well :
If $A = B$, then $A \subset B$ and $B \subset A$.
We can write the two instances in a single representation as :
$A \subset B$ and $B \subset A \Leftrightarrow A = B$


The symbol “$\Leftrightarrow$” means that the relation holds in either direction.


Relation with itself 


Every set is subset of itself. This is so because every element is present in 
itself. 


Relation with Empty set 


Empty set is a subset of every set. This deduction is direct consequence of 
the fact that empty set has no element. As such, this set is subset of all sets. 


Proper subset 


We have seen from the deductions above that the special circumstance of
“equality” can blur the distinction between “set” and “subset”. In order to
emphasize the mother-child relation between sets, we coin the term “proper
subset”. If “A” is a subset of “B” and at least one element of set “B” is not
present in set “A”, then “A” is a “proper” subset of set “B”; otherwise not.
This means that set “B” is a larger set, which, besides other elements, also
includes all elements of set “A”.


Set of vowels in English alphabet, “V”, is a “proper” subset of set of 
English alphabet, “E”. All elements of “V” are present in “E”, but not all 
elements of “E” are present in “V”. 


There are some differences in convention. Some write a “proper” subset
relation using the symbol “$\subset$” and write the symbol “$\subseteq$” to allow
the possibility of equality as well. We have chosen not to differentiate the two
subset types.


Number system 


The number system is one such system, in which different number groups
are related. The natural numbers are a subset of the integers, the integers are
a subset of the rational numbers, and the rational numbers are a subset of the
real numbers. None of these sets are equal. Hence, the relations are described
by proper subsets.


$N \subset Z$
We can write the chain of relations among the number sets :


$\Rightarrow P \subset N \subset Z \subset Q \subset R$


However, the irrational numbers are also a subset of the real numbers, and no
irrational number is a rational number. We represent this relation by
emphasizing that the rational numbers are not a subset of the irrational numbers
or vice versa. We depict this relation as :


Q (rational numbers) $\not\subset$ T (irrational numbers)


But the irrational numbers are a subset of the real numbers. The real numbers
comprise only two subsets at the highest level: rational and irrational.


Therefore, the irrational numbers are the remaining collection after deducting
the rational numbers from the real numbers.


Following this logic, we define the set of irrational numbers as :


T (irrational numbers) = $\{x : x \in R \text{ and } x \notin Q\}$


Power set 


Power set is formed of all possible subsets of a given set. It is denoted as 
P(A). 


Power set 
The collection of all subsets of a set “A” is called power set, P(A). 


For example, consider a set given by :
A = {1,3,4}


What are the possible subsets? There are three subsets consisting of
individual elements: {1}, {3} and {4}. Then, elements taken two at a time
form the following subsets : {1,3}, {1,4} and {3,4}. Since order or sequence
does not matter in set representation, there are only three subsets of two
elements taken together. Now, the elements taken three at a time form
only one subset : {1,3,4}. Remember, a set is a subset of itself. Further, the
empty set ($\phi$) is a subset of any set. Hence, $\phi$ is also a subset of the
given set “A”.


The set comprising all possible subsets of the given set “A” is :
$P(A) = \{\phi, \{1\}, \{3\}, \{4\}, \{1,3\}, \{1,4\}, \{3,4\}, \{1,3,4\}\}$
We note two important points from this representation of power set : 


1: The elements of a power set are themselves sets. In other words, every
element of a power set is a set.


2: If the number of elements (cardinality) in a set is “n”, then the number of
elements in the power set is $2^n$.


For a set having three elements, the total number of elements in the power
set is :


$2^n = 2^3 = 8$


We can see that this result is consistent with the illustration given above.
To avoid confusion, we should emphasize that the empty set is not counted
among the elements of a set when finding its cardinality. It is, however,
counted as a member of the power set.
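
The listing of subsets and the count $2^n$ can be reproduced with a short sketch. It uses `itertools.combinations` from the standard library and `frozenset` so that subsets can themselves be members of a set; the function name `power_set` is illustrative.

```python
from itertools import combinations


def power_set(elements):
    """Return the set of all subsets of `elements`, each subset as a frozenset."""
    items = list(elements)
    return {
        frozenset(subset)
        for size in range(len(items) + 1)      # subset sizes 0, 1, ..., n
        for subset in combinations(items, size)
    }


A = {1, 3, 4}
P_A = power_set(A)

print(len(P_A))                 # number of subsets of a 3-element set
print(frozenset() in P_A)       # the empty set is a member of the power set
print(frozenset(A) in P_A)      # the set itself is a member of the power set
```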


Example 


Problem 1: The finite sets “A” and “B” have “m” and “n” numbers of 
elements respectively. The total numbers of subsets of “A” is 56 more than 
the total numbers of subsets of “B”. Find “m” and “n”. 


Solution : According to relation obtained for power set, the total numbers 
of subsets of “A” and “B” are : 


$k_A = 2^m$
$k_B = 2^n$
According to question,
$k_A - k_B = 56$
$\Rightarrow 2^m - 2^n = 56$


We need to find two equations to find “m” and “n”. For this we seek 
expansion of “56” in terms of powers of “2”. 


$56 = 8 \times 7 = 8(8 - 1) = 2^3(2^3 - 1)$


In order to get this form, we rearrange the expression on the LHS of the 
earlier equation as : 


$\Rightarrow 2^n(2^{m-n} - 1) = 2^3(2^3 - 1)$
Equating powers of similar base,


$n = 3$ and $m - n = 3$, so that $m = 6$
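
As a cross-check on the algebra, a brute-force search over small values of “m” and “n” finds the same answer. The search limit of 20 is an arbitrary assumption that is comfortably large, since $2^m$ quickly exceeds 56.

```python
# Search for cardinalities m > n >= 0 with 2^m - 2^n = 56
solutions = [
    (m, n)
    for m in range(1, 21)
    for n in range(0, m)
    if 2**m - 2**n == 56
]

print(solutions)
```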


Intervals 


An interval is an alternative way to represent a subset of real numbers. The
real numbers are represented by a number line having infinitely many points. We
can think of any segment of this number line as a subset or interval. Consider an
interval, where “a” and “b” belong to the real numbers and a < b :


$a < x < b$


The value of “x” falls between “a” and “b”. For example, the interval 2<x<4
is the collection of all points lying between the end points 2 and 4. The
important thing is that this interval does not include the end points and is
called an “open” interval. We can represent this collection as a set in “set
builder form” as :


$\{x : x \in R \text{ and } 2 < x < 4\}$


Alternatively, we can use pair of small brackets to represent open interval as 


(2,4) 


The two forms of representation are equivalent. The latter form is
obviously an easier and more convenient representation of a subset of real
numbers. We use a small bracket “(“ or “)” to denote an interval that excludes
the end point. Likewise, we use a square bracket “[“ or “]” to denote an interval
that includes the end point. We can represent a “closed” interval as [2,4]. This
interval is equivalent to :


$[2,4] = 2 \le x \le 4$
We can have a combination of “open” and “closed” brackets like :


$(2,4] = 2 < x \le 4$


As a reminder, we should note that interval corresponding to real numbers 
or its subset is an infinite set as we can have infinite points on the line 
segment corresponding to an interval. 


Graphical representation 


The graphical representation uses a segment of line on the number line 
representing real numbers. The line segment is demarcated by a pair of two 
small circles — a filled circle to mean that end point is included in the 
interval and an unfilled circle to mean that end point is excluded from the 
interval. 


Let us consider $a, b \in R$ and $a < b$, then
$(a,b) = a < x < b$
$[a,b) = a \le x < b$
$(a,b] = a < x \le b$


$[a,b] = a \le x \le b$


Graphically,
[Figure: Intervals. Representation on the real number line of 1: (a,b), 2: [a,b), 3: (a,b], 4: [a,b].]
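
The four kinds of interval differ only in whether each end point is included. A minimal membership test can make this concrete; the function name `in_interval` and its keyword arguments are illustrative, not standard notation.

```python
import math


def in_interval(x, a, b, include_left=False, include_right=False):
    """Test whether x lies in the interval from a to b.

    include_left / include_right play the role of a square bracket (filled circle);
    the default False corresponds to a small bracket (unfilled circle).
    """
    left_ok = x >= a if include_left else x > a
    right_ok = x <= b if include_right else x < b
    return left_ok and right_ok


print(in_interval(2, 2, 4))                                          # (2,4): end point excluded
print(in_interval(2, 2, 4, include_left=True))                       # [2,4): left end point included
print(in_interval(4, 2, 4, include_right=True))                      # (2,4]: right end point included
print(in_interval(4, 2, 4, include_left=True, include_right=True))   # [2,4]
print(in_interval(10**9, 2, math.inf))                               # unbounded on the right
```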


Set of real numbers 


The real numbers are represented graphically by a straight line. The question
that we seek to answer here is whether the set of real numbers is bounded by
infinity. In other words, whether we can define the interval of real numbers like


$[-\infty, \infty]$


The literal meaning of infinity is “unboundedness”. Infinity is thought of as
an arbitrarily large quantity, which may be either positive or negative. It does
not have a finite (fixed) value. Infinity, therefore, is not a part of the real
number system. It does not lie on the real number line. For this reason, we
cannot assign infinity to a real variable like (though we generally do):


$x = \infty$


It follows, then, that the appropriate interval, representing real numbers, is
open at both ends :


$R = -\infty < x < \infty = (-\infty, \infty)$


Interval of real numbers greater than or less than a given value 


In the interval form, we can write the set of real numbers greater than a
given value, "a", as:
$a < x < \infty = (a, \infty)$
This is equivalent to :
$x > a$


The final notation "x > a" does not need to mention infinity. It is an
interval of real numbers greater than the given value 'a' appearing on the
right. It is implied that x can be any large value. Similarly, the interval of
real numbers less than a given value is :


$-\infty < x < a = (-\infty, a)$


$x < a$


Venn Diagrams (optional) 

This module introduces Venn diagrams as a method for solving some probability 
problems. This module is included in the Elementary Statistics textbook/collection as an 
optional lesson. 


A Venn diagram is a picture that represents the outcomes of an experiment. It generally 
consists of a box that represents the sample space S together with circles or ovals. The 
circles or ovals represent events. 


Example: 

Suppose an experiment has the outcomes 1, 2, 3, ... , 12 where each outcome has an 
equal chance of occurring. Let event A = {1, 2, 3, 4, 5, 6} and event B = {6, 7, 8, 9}. 
Then A AND B = {6} and A OR B = {1, 2, 3, 4, 5, 6, 7, 8, 9}. The Venn diagram is
as follows: 


Example: 

Flip 2 fair coins. Let A = tails on the first coin. Let B = tails on the second coin. Then 
A = {TT, TH} and B = {TT, HT}. Therefore, A AND B = {TT}. 

A OR B = {TT, TH, HT}

The sample space when you flip two fair coins is S = {HH, HT, TH, TT}. The 
outcome HH is in neither A nor B. The Venn diagram is as follows: 


[Figure: Venn diagram of the sample space S with events A and B.]


Example: 

Forty percent of the students at a local college belong to a club and 50% work part 
time. Five percent of the students work part time and belong to a club. Draw a Venn 
diagram showing the relationships. Let C = student belongs to a club and PT = student
works part time. 


[Figure: Venn diagram of the sample space S with overlapping events C and PT; the overlap is C and PT.]


If a student is selected at random find 


The probability that the student belongs to a club. P(C) = 0.40. 

The probability that the student works part time. P(PT) = 0.50. 

The probability that the student belongs to a club AND works part time. 

P(C AND PT) = 0.05. 

The probability that the student belongs to a club given that the student works part 
time. 

Equation: 


$P(C|PT) = \frac{P(C \text{ AND } PT)}{P(PT)} = \frac{0.05}{0.50} = 0.1$


The probability that the student belongs to a club OR works part time. 
Equation: 


P(C OR PT) = P(C) + P(PT) — P(C AND PT) = 0.40 + 0.50 — 0.05 = 0.85 
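
The two equations above are simple enough to check numerically. The sketch below uses the probabilities given in the example; the variable names are illustrative.

```python
p_c = 0.40          # P(C): student belongs to a club
p_pt = 0.50         # P(PT): student works part time
p_c_and_pt = 0.05   # P(C AND PT): student does both

# Conditional probability: P(C | PT) = P(C AND PT) / P(PT)
p_c_given_pt = p_c_and_pt / p_pt

# Addition rule: P(C OR PT) = P(C) + P(PT) - P(C AND PT)
p_c_or_pt = p_c + p_pt - p_c_and_pt

print(round(p_c_given_pt, 2), round(p_c_or_pt, 2))
```

Subtracting P(C AND PT) avoids counting the overlap region of the Venn diagram twice.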


Glossary 


Venn Diagram 


The visual representation of a sample space and events in the form of circles or ovals 
showing their intersections. 


Union of sets 


We are familiar with basic algebraic operations. These basic mathematical 
operations, however, are not valid in all contexts. For example, algebraic 
operation such as addition has different details, when operated on vectors. 
Clearly, we expect that these operations will also be not same in the case of 
sets — which are collections and not individual elements. 


Nevertheless, set operations bear resemblance to algebraic operation. For 
example, when we combine (not add) two sets, then the operation involved 
is called “union”. We can see that there is resemblance of the intent of 
addition, subtraction etc in the case of sets also. 


Venn diagrams 


Venn diagrams are pictorial representations of sets/subsets and the
relationships that the sets/subsets have among them. They help us to analyze
relationships and carry out valid set operations in a relatively easier manner
vis-à-vis symbolic representation.


Universal set 


Universal set is the largest set among collection of sets. Importantly, it is 
not the collection of everything as might be conjectured by the 
nomenclature. For example, "R", is universal set comprising of all real 
numbers. The rational numbers, integers and natural numbers are its subset. 
In other consideration, we can call integers as universal set. In that case, 
sets such as {1,2,3}, prime numbers, even numbers, odd numbers are subset 
of the universal set of integers. 


The universal set is pictorially represented by a region enclosed within a 
rectangle on Venn diagram. For illustration, consider the universal set of 
English alphabets and universal set of first 10 natural numbers as shown in 
the top row of the figure 

[Figure: Universal set. The universal set is represented by a region enclosed within a rectangle.]


Many times, however, we may not be required to list elements of a 
universal set. In such case, we represent the universal set simply by a 
rectangle and the symbol for universal set, “U”, in the corner. This is 
particularly helpful, where number of elements in universal set are very 
large. 


The subsets of the universal set are represented by closed curves — usually 
circles. The subset of vowels (V) is shown here within the circle with the 
listing of elements. Note that we have not listed all the alphabets for 
universal set and used the symbol “U” in the corner only. 

[Figure: Subset. The subset of the universal set is represented by a closed curve, usually a circle.]


Union of sets 


Union works on two operands, each of which is a set. The operation is 
denoted by symbol " U ". Now, the question is : what do we expect when 
two sets are combined? Clearly, we need to enlist all the elements of two 
sets in the resulting set. 


Union of two sets 
The union of sets “A” and “B” is a third set, which consists of all the
elements of the two sets.


In symbols,
$A \cup B = \{x : x \in A \text{ or } x \in B\}$


The word “or” in the set builder form defining union is important. It means 
that the element “x” belongs to either “A” or “B”. The element may belong 
to both sets (common to two sets), but not necessarily. We can, therefore, 
infer that union set consists of : 


1. elements exclusive to A 
2. elements exclusive to B 
3. elements common to A and B 


As a set includes only distinct elements, the common elements are
represented only once in the union set. Thus, the union set consists of elements
of both sets without repeating an element. Now, the set is represented on a
Venn diagram as shown here.

[Figure: Union of two sets A and B. The representation of union on Venn diagram.]


For illustration of working with union, let us consider two sets of positive 
integers as given here, 


A = {1,2,3,4,5,6}
B = {4,5,6,7,8}
The union of two sets is :
$\Rightarrow A \cup B$ = {1,2,3,4,5,6,4,5,6,7,8}


But repetition of elements in a set does not change it. Hence, we need not
repeat elements in the resulting union.


$\Rightarrow A \cup B$ = {1,2,3,4,5,6,7,8}


Here, universal set is natural numbers. The representation of union of joint 
sets is shown in the figure. We can observe that very construction of union 
on Venn diagram ensures that elements are not repeated. 

[Figure: Union of two sets. The representation of union on Venn's diagram.]
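
The same illustration can be run with Python's built-in sets, where the union operator is written `|`. The intermediate list is included only to show the repeated elements before they collapse; the variable names are illustrative.

```python
A = {1, 2, 3, 4, 5, 6}
B = {4, 5, 6, 7, 8}

# Listing all elements of both sets repeats the common elements 4, 5 and 6
combined_listing = sorted(A) + sorted(B)

# The union keeps each distinct element once
union = A | B

print(combined_listing)
print(sorted(union))
```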


Interpretation of union set 
Let us examine the defining set of union :


$A \cup B = \{x : x \in A \text{ or } x \in B\}$


We consider an arbitrary element, say “x”, of the union set. Then, we
interpret the conditional meaning as :


If $x \in A \cup B$, then $x \in A$ or $x \in B$.
Can we state this conditional meaning in the reverse order?


If $x \in A$ or $x \in B$, then $x \in A \cup B$.


Yes, we can agree with the second conditional meaning as well. We,
therefore, conclude that the statements work both ways. We write the two
statements together as :


$x \in A \cup B \Leftrightarrow x \in A$ or $x \in B$


We can reach yet another conclusion by observing the representation of the
union set on a Venn diagram. Now, if an arbitrary element “x” does not belong
to the union set, then it is clear that it does not belong to the region
represented by the union set on the Venn diagram. Hence,

[Figure: Union of two sets A and B. The representation of union on Venn's diagram.]


If $x \notin A \cup B \Rightarrow x \notin A$ and $x \notin B$.


The important thing to note here is the word “and” in place of “or” used 
before. Think about it. Here two conditions follow simultaneously. If an 
element does not belong to an union set, then it will not belong to either of 
individual sets simultaneously. Now, the next thing to consider is whether 
this conditional statement will be true other way round as well? 


If $x \notin A$ and $x \notin B \Rightarrow x \notin A \cup B$.


Yes, we can agree to this statement. We, therefore, conclude that the 
statements work in both ways. We write two statements together with the 
help of two ways arrow sign as : 


$x \notin A \cup B \Leftrightarrow x \notin A$ and $x \notin B$.


Union of disjoint sets 


Consider students in class X and class XI. Let us denote the respective sets 
as "T" for tenth and "E" for eleventh class. Clearly, union i.e. combination 
of two sets should include elements from each of the sets. Hence, 


$T \cup E$ = students in class X and XI


This is a straightforward union of two sets. The resulting set comprises all
elements present in either of the sets. Since it is not possible that students
studying in class X are also students of class XI, we are sure that the number
of elements in the union is the sum of the numbers of students in each class.
As there is no commonality between the two sets, it is a union of two
“disjoint” sets. We conclude here that two disjoint sets contribute no common
elements to their union.


Union with subset 


The set “B” contains all elements of its subset “A”. In other words, the
elements of a subset “A” also belong to the set “B”. The operation of union
combines the elements of two sets. The union with a subset, therefore,
consists of elements from both “A” and “B”. However, all elements of “A”
are also elements of “B”. Therefore, we find that the union set is the same as
the superset “B”. Symbolically,


If $A \subset B$, then $B \cup A = B$.


We can check this deduction with the help of an example. Let us consider 
two sets as : 


$A = \{4,5,6\}$
$B = \{1,2,3,4,5,6\}$


Here, we see that $A \subset B$. Now,


$B \cup A = \{1,2,3,4,5,6,4,5,6\} = \{1,2,3,4,5,6\} = B$


[Figure: Union with subset. The representation of union with subset on a Venn diagram.]


Multiple unions 


If $A_1, A_2, A_3, \ldots, A_n$ is a finite family of sets, then their union,
taken one after another, is denoted as :


$A_1 \cup A_2 \cup A_3 \cup \ldots \cup A_n$


Important results 


In this section we shall discuss some of the important characteristics/ 
deductions for the union operation. 


Idempotent law 


The literal meaning of the word “idempotent” is “unchanged when 
multiplied by itself”. Following the clue, the union of a set with itself is the 
set itself. This is an equivalent statement conveying the meaning of 
“idempotent” in the context of union. Symbolically, 


$A \cup A = A$


The union set consists of distinct elements and common elements taken 
once. Between two sets here, all elements are common. The union set 
consists of all elements of either set. 


Identity law 


Algebraic operators like addition and multiplication have defined
identities, which do not change the other operand of the operator. For
example, if we add “0” to a number, it remains the same. Hence, “0” is the
additive identity. Similarly, “1” is the multiplicative identity.


In the case of union, we find that union of a set with empty set does not 
change the set. Hence, empty set is union identity. 


$A \cup \varphi = A$


As there is no element in empty set, union has same elements as that in “A”. 


Law of U 


All sets are subsets of universal set for a given context. We have seen that 
union with subset results in the set itself. Clearly, union of universal set 
with its subset will result in the universal set itself. 


$U \cup A = U$


Commutative law 


In order to assess whether commutative property holds or not, we consider 
the example, used earlier. Let the sets be : 


$A = \{1,2,3,4,5,6\}$
$B = \{4,5,6,7,8\}$
Then,
$A \cup B = \{1,2,3,4,5,6,4,5,6,7,8\} = \{1,2,3,4,5,6,7,8\}$
$B \cup A = \{4,5,6,7,8,1,2,3,4,5,6\} = \{1,2,3,4,5,6,7,8\}$


Thus, we see that the order of operands around the union operator does not
affect the result, i.e. $A \cup B = B \cup A$. We can also appreciate this law
on a Venn diagram, which does not change when the positions of the sets across
the union operator are exchanged.


Associative law 


The associative property also holds with respect to union operator. We 
know that associative property is about changing the place of parentheses as 
here : 


$(A \cup B) \cup C = A \cup (B \cup C)$


The parentheses simply change the precedence of operation. On a Venn
diagram, a union involving three sets appears the same, irrespective of the
sequence in which we apply the union operations.
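
The union laws discussed in this section can be spot-checked on concrete sets. This is only a sketch: it assumes Python's built-in sets, and the third set `C` and the universal set `U` are hypothetical choices made for illustration. A check on sample sets illustrates the laws but does not prove them.

```python
A = {1, 2, 3, 4, 5, 6}
B = {4, 5, 6, 7, 8}
C = {6, 8, 10}            # hypothetical third set
U = set(range(1, 11))     # assumed universal set: 1..10
EMPTY = set()

# Each entry records whether the named law holds for these sample sets.
laws = {
    "idempotent": A | A == A,
    "identity": A | EMPTY == A,
    "law of U": U | A == U,
    "commutative": A | B == B | A,
    "associative": (A | B) | C == A | (B | C),
}

for name, holds in laws.items():
    print(name, holds)
```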

[Figure: Union of three sets.]


Intersection of sets 


We have pointed out that a set representing a real situation is not an isolated
collection. Sets, in general, overlap with each other. This is primarily because
a set is defined on a few characteristics, whereas elements generally can
possess many characteristics. Unlike the union, which includes all elements
from two sets, the intersection of two sets includes only the common
elements.


Intersection of two sets 
The intersection of sets “A” and “B” is the set of all elements common 
to both “A” and “B”. 


The use of the word “and” between two sets in defining an intersection is quite
significant. Compare it with the definition of union, where we used the word
“or” between two sets. Keeping these two words in mind while deciding
membership of a union or an intersection is helpful in applications.


The intersection operation is denoted by the symbol "$\cap$". We can write the
intersection in set builder form as :


[Figure: Intersection of two sets. The intersection set consists of elements common to two sets.]


$A \cap B = \{x : x \in A \text{ and } x \in B\}$


Again note use of the word “and” in set builder qualification. We can read 
this as “x” is an element, which belongs to set “A” and set “B”. Hence, it 
means that “x” belongs to both “A” and “B”. 


In order to understand the operation, let us consider the earlier example 
again, 


$A = \{1,2,3,4,5,6\}$
$B = \{4,5,6,7,8\}$
Then,
$A \cap B = \{4,5,6\}$


On Venn diagram, an intersection is the region intersected by circles, which 
represent two sets. 
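
As a quick check of the example, the same intersection can be computed with Python's built-in sets (an assumption of this sketch, not something used in the original text). The name `common` is illustrative.

```python
A = {1, 2, 3, 4, 5, 6}
B = {4, 5, 6, 7, 8}

# The & operator keeps only the elements that belong to both sets.
common = A & B
print(sorted(common))  # [4, 5, 6]

# An element belongs to the intersection exactly when it is in A and in B.
for x in range(1, 11):
    assert (x in common) == (x in A and x in B)
```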


Interpretation of Intersection set 


Let us examine the defining set of intersection :
$A \cap B = \{x : x \in A \text{ and } x \in B\}$


We consider an arbitrary element, say “x”, of the intersection set. Then, we
interpret the conditional meaning as :


If $x \in A \cap B \Rightarrow x \in A$ and $x \in B$.
The conditional statement is true in opposite direction as well. Hence,
If $x \in A$ and $x \in B \Rightarrow x \in A \cap B$.
We summarize two statements with two ways arrow as :
$x \in A \cap B \Leftrightarrow x \in A$ and $x \in B$


In addition to the two-way relation, there is an interesting aspect of
intersection. The intersection is a subset of each of the two sets. From the
Venn diagram, it is clear that :


$(A \cap B) \subset A$
and
$(A \cap B) \subset B$


Intersection with a subset 


Since all elements of a subset are present in the set, it emerges that the
intersection with a subset is the subset. Hence, if “A” is a subset of set “B”,
then:


$B \cap A = A$


Intersection of disjoint sets 


If no element is common to two sets “A” and “B” , then the resulting 
intersection is an empty set : 


$A \cap B = \varphi$


In that case, two sets “A” and “B” are “disjoint” sets. 


Multiple intersections 


If $A_1, A_2, A_3, \ldots, A_n$ is a finite family of sets, then their
intersection, taken one after another, is denoted as :


$A_1 \cap A_2 \cap A_3 \cap \ldots \cap A_n$


Important results 


In this section we shall discuss some of the important characteristics/ 
deductions for the intersection operation. 


Idempotent law 


The intersection of a set with itself is the set itself.
$A \cap A = A$


This is because the intersection is the set of common elements. Here, all
elements of a set are common with itself. The resulting intersection,
therefore, is the set itself.


Identity law 


The intersection with universal set yields the set itself. Hence, universal set 
functions as the identity of the intersection operator. 


$A \cap U = A$


It is easy to interpret this law. Only the elements in "A" are common to
universal set. Hence, intersection, being the set of common elements, is set
"A".


Law of empty set 


Since the empty set has no elements, it has no element in common with any
other set. It follows that the intersection of the empty set with any set is
the empty set.


$\varphi \cap A = \varphi$


Commutative law 


The order of sets around intersection operator does not change the 
intersection. Hence, commutative property holds in the case of intersection 
operation. 


$A \cap B = B \cap A$


Associative law 
The associative property holds with respect to intersection operator. 
$(A \cap B) \cap C = A \cap (B \cap C)$


The intersection of sets “A” and “B” on Venn’s diagram is :

[Figure: Intersection of two sets. The intersection is a set of common elements and shown as colored region.]


In turn, the intersection of set “$A \cap B$” and set “C” is the small region
in the center :


[Figure: Intersection involving three sets. Intersection of a set with "the intersection set of two sets".]


It is easy to visualize that the ultimate intersection is independent of the 
sequence of operation. 


Distributive law 


The intersection operator ($\cap$) is distributed over the union operator ($\cup$) :
$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$


We can check this relation with the help of a Venn diagram. For
convenience, we have not shown the universal set. In the first diagram on
the left, the colored region shows the union of sets “B” and “C”, i.e.
$B \cup C$. The colored region in the second diagram on the right shows the
intersection of set “A” with the union obtained in the first diagram, i.e.
$A \cap (B \cup C)$.

[Figure: Distributive law. Panels: (1) $B \cup C$; (2) $A \cap (B \cup C)$. Distribution of intersection operator over union operator.]


We can now interpret the colored region in the second diagram from the 
point of view of expression on the right hand side of the equation : 


$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$


The colored region is indeed the union of two intersections : "$A \cap B$" and
"$A \cap C$". Thus, we conclude that the distributive property holds for
"intersection operator over union operator".


In the same manner, we can prove distribution of “union operator over 
intersection operator” 


$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$


Analytical proof 


Distributive properties are important and used for practical application. In 
this section, we shall prove the same in analytical manner. For this, let us 
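consider an arbitrary element “x”, which belongs to set "$A \cap (B \cup C)$" :


$x \in A \cap (B \cup C)$
Then, by definition of intersection :

$\Rightarrow x \in A$ and $x \in (B \cup C)$

$\Rightarrow x \in A$ and ($x \in B$ or $x \in C$)

$\Rightarrow$ ($x \in A$ and $x \in B$) or ($x \in A$ and $x \in C$)

$\Rightarrow (x \in A \cap B)$ or $(x \in A \cap C)$
$\Rightarrow x \in (A \cap B)$ or $x \in (A \cap C)$
$\Rightarrow x \in (A \cap B) \cup (A \cap C)$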


But, we had started with "$A \cap (B \cup C)$" and used its definition to show
that “x” belongs to another set. It means that the other set contains at least
the elements of the first set. Thus,


$\Rightarrow A \cap (B \cup C) \subset (A \cap B) \cup (A \cap C)$


Similarly, we can start with "$(A \cap B) \cup (A \cap C)$" and reach the
conclusion that :


$\Rightarrow (A \cap B) \cup (A \cap C) \subset A \cap (B \cup C)$
If sets are subsets of each other, then they are equal. Hence,
$\Rightarrow A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$


Proceeding in the same manner, we can also prove other distributive 
property of “union operator over intersection operator” : 


$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$
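
Both distributive laws can also be checked by brute force over every choice of three subsets of a small universal set. The sketch below assumes Python; the helper names `subsets` and `distributive_laws_hold` and the three-element universe are hypothetical choices for illustration. Such a check supports the analytical proof but does not replace it.

```python
from itertools import combinations

def subsets(universe):
    """Yield every subset of the given finite set."""
    items = list(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)

def distributive_laws_hold(A, B, C):
    """Check both distributive laws for one choice of A, B and C."""
    first = A & (B | C) == (A & B) | (A & C)
    second = A | (B & C) == (A | B) & (A | C)
    return first and second

U = {1, 2, 3}                 # small assumed universal set
all_subsets = list(subsets(U))

triples = [(A, B, C) for A in all_subsets for B in all_subsets for C in all_subsets]
ok = all(distributive_laws_hold(A, B, C) for A, B, C in triples)
print(len(triples), ok)  # 512 True
```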


Difference of sets 


We can extend the concept of subtraction, used in algebra, to the sets. If a set 
“B” is subtracted from set “A”, the resulting difference set consists of elements, 
which are exclusive to set “A”. We represent the symbol of difference of sets as 
“A-B” and pronounce the same as “A minus B”. 


Difference of sets 
The difference of sets “A-B” is the set of all elements of “A”, which do not 
belong to “B”. 


In the set builder form, the difference set is : 

$A - B = \{x : x \in A \text{ and } x \notin B\}$
and

$B - A = \{x : x \in B \text{ and } x \notin A\}$


On Venn’s diagram, the difference "A-B" is the region of “A”, which excludes 
the common region with set “B”. 
[Figure: Difference of two sets. The difference of two sets is a disjoint set.]


Interpretation of difference set 
Let us examine the defining set of difference :
$A - B = \{x : x \in A \text{ and } x \notin B\}$


We consider an arbitrary element, say “x”, of the difference set. Then, we
interpret the conditional meaning as :


If $x \in A - B \Rightarrow x \in A$ and $x \notin B$.

The conditional statement is true in opposite direction as well. Hence,
If $x \in A$ and $x \notin B \Rightarrow x \in A - B$.

We can summarize two statements with two ways arrow as :


$x \in A - B \Leftrightarrow x \in A$ and $x \notin B$.


Composition of a set 


From Venn’s diagram, we observe that if we take the union of $(A \cap B)$ with
either of the difference sets, then we get the complete individual set.


$A = (A - B) \cup (A \cap B)$
and


$B = (B - A) \cup (A \cap B)$


Difference of sets is not commutative 


The positions of sets about minus operator affect the result. It is clear from the 
figure above, where “A-B” and “B-A” represent different regions on Venn’s 
diagram. As such, the difference of sets is not commutative. Let us consider the 
example used earlier, where : 


$A = \{1,2,3,4,5,6\}$


$B = \{4,5,6,7,8\}$
Then,
$\Rightarrow A - B = \{1,2,3\}$
and
$\Rightarrow B - A = \{7,8\}$
Clearly,


$A - B \neq B - A$
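
The two differences can be computed directly, assuming Python's built-in sets (an assumption of this sketch). The variable names are illustrative.

```python
A = {1, 2, 3, 4, 5, 6}
B = {4, 5, 6, 7, 8}

a_minus_b = A - B   # elements of A that do not belong to B
b_minus_a = B - A   # elements of B that do not belong to A
print(sorted(a_minus_b), sorted(b_minus_a))  # [1, 2, 3] [7, 8]

# Each set is recovered from its difference set and the intersection.
assert A == (A - B) | (A & B)
assert B == (B - A) | (A & B)
```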


Symmetric difference 


From the Venn’s diagram, we can see that the union of two sets consists of three
distinct regions. Alternatively, we can say that the region represented by the
union of two sets is equal to the sum of the regions representing three “disjoint”
sets : (i) difference set $A - B$ (ii) intersection set $A \cap B$ and (iii)
difference set $B - A$.


We use the term “symmetric difference” for the set combining the two differences
as marked on Venn’s diagram. It is denoted as “$A \Delta B$”.


$A \Delta B = (A - B) \cup (B - A)$
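
A small sketch of the symmetric difference, assuming Python's built-in sets, where the `^` operator computes it directly. The sets reuse the earlier example and the variable names are illustrative.

```python
A = {1, 2, 3, 4, 5, 6}
B = {4, 5, 6, 7, 8}

# Built-in symmetric difference versus the defining expression.
sym_diff = A ^ B
by_definition = (A - B) | (B - A)
print(sorted(sym_diff))  # [1, 2, 3, 7, 8]

# The union is made of three disjoint parts: A - B, the intersection, and B - A.
three_parts = (A - B) | (A & B) | (B - A)
```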


Complement of a set 


The complement is a special case of the difference operation. The set in question 
is subtracted from universal set, “U”. Thus, one of the sets in difference 
operation is fixed. We define complement of a set as its difference with 
universal set, "U". The complement of a set is denoted by the same symbol as 
that of set, but with an apostrophe. Hence, complement of set A is set A’. 


Complement of a set 
The complement of a set “A” consists of elements, which are elements of 
“U”, but not the elements of “A”. 


We write the complement set in terms of set builder form as : 


$A' = \{x : x \in U \text{ and } x \notin A\}$


Note that the elements of A’ do not belong to set “A”. On Venn’s diagram, the
complement of “A” is the remaining region of the universal set.

[Figure: Complement of a set. The complement of a set is the remaining region of the universal set.]


Interpretation of complement 


Proceeding as before we can read the conditional statement for the complement 
with the help of two ways arrow as : 


$x \in A' \Leftrightarrow x \in U$ and $x \notin A$
In terms of minus or difference operation,
$A' = U - A$


It is clear from the representation on Venn’s diagram that the universal set
comprises two distinct sets, set A and complement set A’.


$\Rightarrow U = A \cup A'$


Complement of universal set


The complement of the universal set is the empty set. It is so because the
difference of the universal set with itself is the empty set (see Venn's diagram).


$U' = \{x : x \in U \text{ and } x \notin U\} = \varphi$


Complement of empty set 


The complement of the empty set is the universal set. It is so because the
difference of the universal set with the empty set is the universal set (see
Venn's diagram).


$\varphi' = \{x : x \in U \text{ and } x \notin \varphi\} = U$


Complement of complement set is set itself 


The complement of complement set is set itself. The complement set is defined 
as: 


$A' = U - A$

Now, complement of complement set is :

$\Rightarrow (A')' = (U - A)'$
Let us consider the example, where :

$U = \{1,2,3,4,5,6,7,8\}$

$A = \{1,2,3,4,5,6\}$

Then,
$\Rightarrow A' = \{1,2,3,4,5,6,7,8\} - \{1,2,3,4,5,6\} = \{7,8\}$

Again taking complement, we have :


$(A')' = \{1,2,3,4,5,6,7,8\} - \{7,8\} = \{1,2,3,4,5,6\} = A$
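
The complement results so far can be checked with a short sketch. Python sets have no built-in complement, so the helper `complement` below is a hypothetical function written for this illustration; it takes the complement relative to the universal set `U` of the example.

```python
U = {1, 2, 3, 4, 5, 6, 7, 8}
A = {1, 2, 3, 4, 5, 6}

def complement(s, universe=U):
    """Return the elements of the universal set that are not in s."""
    return universe - s

a_prime = complement(A)
print(sorted(a_prime))              # [7, 8]
print(complement(a_prime) == A)     # True: complement of complement is the set
print(complement(U) == set())       # True: complement of U is the empty set
print(complement(set()) == U)       # True: complement of the empty set is U
```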


Union with complement set 


The union of a set with its complement is universal set :
$A \cup A' = \{x : x \in U \text{ and } x \in A\} \cup \{x : x \in U \text{ and } x \notin A\} = U$


From Venn’s diagram also, we see that universal set consists of set A and
its complement A’.


$U = A \cup A'$
The two sets on the right side of the equation are disjoint sets. Hence, 


AWA 


Intersection with complement set 


There is nothing common between set A and its complement A’. Thus,
intersection of a set with its complement yields the empty set,


$A \cap A' = \varphi$


De-morgan’s laws 


In real world situations, we often want to negate a condition of incidence. For
example, consider a class in the school. Some students play either basketball or
football or both, but there are students who play neither basketball nor football.
We have to identify the latter category of students as a set.


Let the set of students playing basketball be “B” and that playing football be 
“F”, Then, students who do not play basketball is complement set B’ and 
students who do not play football is complement set F’. We have shown these 
complement sets separately for visualization. Actually, these complement sets 
are drawn to the same universal set, "U". 


The two complement sets are, however, overlapping sets. There are students in the
set B’ who play football and there are students in the set F’ who play basketball.
In order to remove those students playing the other game, we intersect the two
complements. The members of the intersection of the two complements, therefore,
represent students who play neither basketball nor football. This intersection is
shown as the third, bottom Venn’s diagram in the figure.

[Figure: Intersection $(B \cup F)'$. Intersection of two complement sets.]


Looking at the intersection of two complement sets, however, we observe that
this is equal to the complement of the union “$B \cup F$”. This conclusion can be
derived from basic interpretation as well. We know that the union “$B \cup F$”
represents students who play either or both games. The complement of the union,
therefore, represents students who play neither basketball nor football.


This fact is the first De-morgan’s law. Symbolically,
$B' \cap F' = (B \cup F)'$
The second De-morgan’s law is :


$B' \cup F' = (B \cap F)'$


In the parlance of the illustration given earlier, let us interpret the right hand
side of the second De-morgan's law. The intersection “$B \cap F$” represents
students playing both games. Its complement, therefore, represents students who do
not play both games, but may play one of them.

[Figure: Complement set $(B \cap F)'$. Complement of intersection of two sets.]


Analytical proof 


Here, we shall prove the first De-morgan’s law. The second law can be proved in
similar fashion. Let us consider an arbitrary element “x” belonging to the set
$(A \cup B)'$.


$x \in (A \cup B)'$
$\Rightarrow x \notin (A \cup B)$
Then, by definition of union,
$\Rightarrow x \notin \{x : x \in A \text{ or } x \in B\}$
Here, the negation of “or” is interpreted as “and” between the negated conditions,
$\Rightarrow x \notin A$ and $x \notin B$


$\Rightarrow x \in A'$ and $x \in B'$
$\Rightarrow x \in A' \cap B'$


But, we had started with $(A \cup B)'$ and used its definition to show that “x”
belongs to another set. It means that the other set contains at least the elements
of the first set. Thus,


$(A \cup B)' \subset A' \cap B'$

Similarly, we can start with $A' \cap B'$ and reach the conclusion that :
$A' \cap B' \subset (A \cup B)'$

If sets are subsets of each other, then they are equal. Hence,


$A' \cap B' = (A \cup B)'$
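
Both De-morgan's laws can be illustrated on a concrete class roster. The sketch assumes Python; the roster of ten numbered students, the memberships of `B` and `F`, and the helper `complement` are all hypothetical values invented for this illustration.

```python
U = set(range(1, 11))   # assumed class of ten students, numbered 1..10
B = {1, 2, 3, 4}        # hypothetical basketball players
F = {3, 4, 5, 6}        # hypothetical football players

def complement(s):
    """Complement relative to the universal set U."""
    return U - s

# First law: students who play neither game.
neither = complement(B) & complement(F)
first_law = neither == complement(B | F)

# Second law: students who do not play both games.
not_both = complement(B) | complement(F)
second_law = not_both == complement(B & F)

print(sorted(neither))        # [7, 8, 9, 10]
print(first_law, second_law)  # True True
```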


Example 


Problem 1: In the reference of students in a class, the set “B” represents 
students, who play basketball. The set “F” represents students, who play 
football. The set “B” and “F” are left and right circles respectively on the Venn's 
diagram shown below. Identify regions marked 1 to 8 on the Venn’s diagram. 
Also interpret regions identified by combination U — (6+7). 

[Figure: Interpreting sets. Venn's diagram with regions marked 1 to 8.]


Solution : The meanings of the regions marked 1 to 8 are as given hereunder :


1: B-F: It represents the difference of “B” and “F”. It consists of students, who 
play basketball, but not football. 


2: F-B: It represents the difference of “F” and “B”. It consists of students, who 
play football, but not basketball. 


3: $B \cap F$ : It represents the intersection of two sets. It consists of
students, who play both basketball and football.


4: B: It represents the set “B”. It is union of two disjoint sets “B-F” and
“$B \cap F$”. It consists of students, who play basketball.


5: F: It represents the set “F”. It is union of two disjoint sets “F-B” and
“$B \cap F$”. It consists of students, who play football.


6: $B \cup F$ : It represents the union set of set “B” and “F”. Equivalently, it
is union of three disjoint sets “B-F”, “$B \cap F$” and “F-B”. It consists of
students, who play either of two games or both.


7: $(B \cup F)'$ : It represents the complement of union set “$B \cup F$”. It
consists of students, who play neither basketball nor football.


8: $(B - F) \cup (F - B)$ : It represents union of two disjoint difference sets
“B-F” and “F-B”. It consists of students, who play only one game.


The region, identified by U - (6+7), is complement of “$B \cap F$”. It represents
students, who do not play both games, but may play one of them.


Working with two sets 


There are finite numbers of elements in finite set. This allows us to analyze 
numbers of elements in different sets that results from the operations carried 
on them. In this module, we shall study different operations on sets in the 
context of practical applications. However, we shall limit ourselves to the 
interaction, involving two sets. The interaction, involving three sets, will be 
dealt in a separate module. 


We use a specific notation to represent the number of elements in a set. For
example, the number of elements in set "A" is represented as "$n(A)$", whereas
we denote the number of elements in the union as "$n(A \cup B)$".


Elements in the union of two sets 


The area, demarcated with solid line, in the Venn’s diagram, shows the 
union of two sets denoted by ( A U B ). We want to know the numbers of 
elements in this union in terms of numbers of elements in individual sets. 
Union of two sets 


The common elements in the 
union set is counted only once. 


The sum of the numbers in the individual sets is generally greater than the 
numbers in the union. The reason is that union includes common elements 
only once. On the other hand, sum of the numbers of individual sets counts 
common elements once with each set — in total two times. Clearly, it is 
required that we deduct the numbers of elements, which are common to 
each set, from the sum of numbers of elements in individual sets. Hence, 


$n(A \cup B) = n(A) + n(B) - n(A \cap B)$


Here, $n(A \cap B)$ represents the number of elements common to the two sets.
As a reminder only, we note that plus (+) is not a valid set operation. We,
however, use this algebraic operation here as we are now dealing with the
numbers of elements in sets, not with the sets themselves.


Alternatively, we can approach this expansion in yet another way. See the 
representation of intersection of two sets. The union of two sets can be 
considered to comprise of three distinct regions. Three regions shown with 
different colors represent three “disjoint” sets. Clearly, 

[Figure: Union of two sets. The region representing union consists of three distinct or disjointed regions.]


$n(A \cup B) = n(A - B) + n(A \cap B) + n(B - A)$


However, we observe that if we add $n(A \cap B)$ to the number of elements in
either of the two difference sets, then we get the number of elements in the
complete individual set.


$n(A) = n(A - B) + n(A \cap B)$
$\Rightarrow n(A - B) = n(A) - n(A \cap B)$
and
$n(B) = n(B - A) + n(A \cap B)$
$\Rightarrow n(B - A) = n(B) - n(A \cap B)$


Substituting for the numbers of the difference sets in the equation for the
numbers in the union set, we have :


$\Rightarrow n(A \cup B) = n(A) - n(A \cap B) + n(A \cap B) + n(B) - n(A \cap B)$
$\Rightarrow n(A \cup B) = n(A) + n(B) - n(A \cap B)$
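
The counting formula can be verified on the two example sets used earlier. This is a sketch assuming Python, where `len` plays the role of n( ); the variable names are illustrative.

```python
A = {1, 2, 3, 4, 5, 6}
B = {4, 5, 6, 7, 8}

# Direct count of the union.
direct = len(A | B)                               # 8

# n(A) + n(B) - n(A ∩ B): common elements are deducted once.
by_formula = len(A) + len(B) - len(A & B)         # 6 + 5 - 3

# Sum over the three disjoint regions.
by_regions = len(A - B) + len(A & B) + len(B - A)  # 3 + 3 + 2

print(direct, by_formula, by_regions)  # 8 8 8
```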


Numbers of elements in the union of “disjoint” sets 


Since there are no common elements between two disjoint sets, the 
intersection between disjoint sets is an empty set. Hence, 


$n(A \cup B) = n(A) + n(B)$


Application 


Application of set theory to real situation is keyed to the interpretation of 
wordings and description. In order to efficiently employ the concepts of set 
theory to real world situations, we need to interpret description of collection 
appropriately. 


Once collections are interpreted correctly, rest is easy. There are indeed 
fewer mathematical operations involved here. Most of these relate to 
determination of numbers of elements in a set. In this section, we shall first 
recapitulate or reinterpret different collections and then work with few 
representative situations for analysis (if we are confident then we can skip 
the recapitulation part). 


Set 


In real situation, we identify a collection with certain characteristic common 
to elements. For example, a set of students in a class is based on the 
characteristic that each student is member of that class. This type of 
interpretation, however, is generally restrictive and leads to 
misinterpretation. We tend to think that the collection is isolated in itself, 
which is obviously wrong. 


We need to free our mind from thinking set as an isolated entity. Some of 
the students might be members of another collection like that of basketball 
team, whereas some others might be members of a particular house, say 
“Amity house” and so on 


In the nutshell, we consider set as a collection, which has multiple 
intersections with other collections. 
A set 


A set has multiple intersections 
with other collections. 


Example 


Problem 1: In the house of total 200 students, 140 students play basketball 
and 80 students play football. Each student of the house plays at least one of 
these two games. How many students play both basketball and football? 


Solution : The individual sets here are students playing basket ball (B) and 
football (F). Hence, 


$n(B) = 140$
$n(F) = 80$


Clearly, nothing prevents a student who plays basketball from also playing
football. This is also evident from the sum of the numbers in each set. The
sum is 140 + 80 = 220, whereas the total number of students in the house is
200 only. Thus, there are students who play both games. Since each student
plays at least one game, we can interpret the total number as the number in
the union of the two individual sets, i.e. $n(B \cup F) = 200$. Hence, applying
the expansion for the numbers of a union :


$n(B \cup F) = n(B) + n(F) - n(B \cap F)$


The students who play both games constitute the intersection of two
individual sets.


Putting values,


$\Rightarrow 200 = 140 + 80 - n(B \cap F)$
$\Rightarrow n(B \cap F) = 140 + 80 - 200 = 20$
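
The rearranged formula used in this solution can be written as a small function. The sketch assumes Python; the function name `count_both` and its parameter names are hypothetical and chosen for this illustration.

```python
def count_both(n_union, n_first, n_second):
    """Number of elements common to two sets, from n(A ∪ B), n(A) and n(B)."""
    return n_first + n_second - n_union

# Problem 1: 200 students in the union, 140 play basketball, 80 play football.
both = count_both(n_union=200, n_first=140, n_second=80)
print(both)  # 20
```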


Universal set and complement 


The universal set is inclusive of all related sets. If we observe the Venn’s
diagram consisting of two individual sets, then we realize that the largest
closed region within the universal set is the union involving the two sets,
i.e. $(A \cup B)$. This union, however, is a subset of U. The remaining area
within the universal set is called the complement of this union.


Now we know that a union represents elements which belong to either set
exclusively or belong commonly to both sets. It means that the complement of
the union represents the region which cannot be defined by the characterizing
criteria of the union. This complement of the union, therefore, represents
situations which are described in terms of the “neither ... nor” type.
Actually, this set is given by De-morgan’s first law.


Example 


Problem 2: In a house of total 200 students, 100 students play basketball, 
60 students play football and 20 play both games. How many students play 
neither basketball nor football? 


Solution : We have already discussed that “neither nor” condition is same 
as that of De-morgan’s first law : 


$n(B' \cap F') = n[(B \cup F)']$
Now expanding the right hand term, we have :
$\Rightarrow n(B' \cap F') = n[(B \cup F)'] = n(U) - n(B \cup F)$
Further using formula for the numbers in a union,
$\Rightarrow n(B' \cap F') = n(U) - n(B) - n(F) + n(B \cap F)$
Putting values,


$\Rightarrow n(B' \cap F') = 200 - 100 - 60 + 20 = 60$


This is the required answer. However, there remains a question : why do we 
consider total numbers of students as the numbers in universal set, “U”, 
unlike previous example in which this number corresponds to numbers in 
the union of individual sets. Remember, earlier question had the phrase 
“Each student of the house plays at least one of these two games”. This 
ensured that total numbers represented the union as everyone was playing 
one of two games. Such restriction is not there in this example. In fact, we 
saw that there are students who are not playing either of two games at all! 
Thus, total number represents universal set in this example. 


Union 


Union of two sets “A” and “B” conveys the meaning of consisting of three
categories of elements : (i) elements exclusively belonging to “A” (ii)
elements exclusively belonging to “B” and (iii) elements commonly
belonging to “A” and “B”. In totality, we see that union conveys the
meaning of “or” : the elements may belong either to a particular set or to
both sets.


Example 


Problem 3: In a group of students, 40 students study either English or 
Mathematics. Of these 25 students study Mathematics, 10 students study 
both Mathematics and English. How many students study English? 


Solution : The word “or” in the first sentence indicates that union of 
students studying Mathematics (M) or English (E) or both is 40. Using 
formula, we have : 


$n(M \cup E) = n(M) + n(E) - n(M \cap E)$
$\Rightarrow n(E) = n(M \cup E) - n(M) + n(M \cap E)$


Putting values,


$\Rightarrow n(E) = 40 - 25 + 10 = 25$


Difference 


In the case of intersection of two sets, we have noted that difference 
represents the exclusive or isolated set, which is not common to other set. 
From the Venn’s diagram, we also observe that a given set is actually 
composed of two sets (i) difference set and (ii) intersection set. 


$n(A) = n(A - B) + n(A \cap B)$
and


$n(B) = n(B - A) + n(A \cap B)$


Example 


Problem 4: In a house of 200 students, 120 students study Mathematics, 60 
students study English and 40 students study both Mathematics and 
English. Find in the house : (i) students who study Mathematics but not 
English (ii) students who study English, but not Mathematics (iii) students 
who study either Mathematics or English and (iv) students who neither 
study Mathematics nor English. 


Solution : Let us first characterize collections as given in the question. Two 
sets are given one for those who study Mathematics (M) and other for those 
who study English(E). The addition of numbers of individual sets is 120 + 
60 = 180, which is less than total numbers of students. Hence, total numbers 
of 200 corresponds to universal set. Here, 


$n(U) = 200$; $n(M) = 120$; $n(E) = 60$ and $n(M \cap E) = 40$.


(i) Students studying Mathematics, but not English means that we need to
find the numbers in the difference of sets, i.e. $M - E$.


$n(M - E) = n(M) - n(M \cap E)$


$\Rightarrow n(M - E) = 120 - 40 = 80$


(ii) Students studying English, but not Mathematics means that we need to
find the numbers in the difference of sets, i.e. $E - M$.


$n(E - M) = n(E) - n(M \cap E)$
$\Rightarrow n(E - M) = 60 - 40 = 20$


(iii) Students who study either Mathematics or English is equal to the
numbers in the union of two sets.


$n(M \cup E) = n(M) + n(E) - n(M \cap E)$
Putting values,
$\Rightarrow n(M \cup E) = 120 + 60 - 40 = 140$


(iv) Students who study neither Mathematics nor English is equal to the
numbers in the complement of the union of two sets.


$n[(M \cup E)'] = n(U) - n(M \cup E) = 200 - 140 = 60$
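
The four parts of this problem follow one pattern, which can be sketched as a single function. The sketch assumes Python; the function name `two_set_counts`, its parameter names and the dictionary keys are hypothetical labels chosen for this illustration.

```python
def two_set_counts(n_universe, n_first, n_second, n_both):
    """Counts for the regions formed by two sets inside a universal set."""
    only_first = n_first - n_both            # n(M - E)
    only_second = n_second - n_both          # n(E - M)
    either = n_first + n_second - n_both     # n(M ∪ E)
    neither = n_universe - either            # n[(M ∪ E)']
    return {
        "only_first": only_first,
        "only_second": only_second,
        "either": either,
        "neither": neither,
    }

# Problem 4: 200 students, 120 study Mathematics, 60 English, 40 both.
counts = two_set_counts(200, 120, 60, 40)
print(counts)
# {'only_first': 80, 'only_second': 20, 'either': 140, 'neither': 60}
```

With the values of Problem 2 (200, 100, 60, 20), the same function gives 60 for the "neither" count.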


Working with three sets 


Working with three sets is similar to working with two sets. The underlying characteristics of set
operations are the same. The union and intersection of sets are carried out with three sets, one after
another. Notably, union and intersection operations are commutative and associative. This allows us to
extend these set operations to a third set in any sequence. Venn’s diagram enables us to visualize the
resulting set. In this module, we shall first formulate the expression for the numbers in the union of
three sets. Subsequently, we shall apply the formulation to real time analysis, using both graphical
(Venn diagram) and analytical methods.


The union, involving three sets, can be considered in terms of union of a set with "union of other two
sets". In that sense, union of three sets represents elements which belong to any of the three sets. Here,
we want to find the expression of numbers of elements in the union set, which is represented as :


$n(A \cup B \cup C)$


Before we work on the expansion of this term, let us first find out what the term "$A \cup B \cup C$"
represents on Venn’s diagram. The figure below shows the representation of this term :
Union of three sets 


The union set represent elements 
of three sets combined together. 


The set shown in the figure above consists of the following classes of elements :


1. The elements which belong exclusively to set “A”, “B” or “C” respectively.
2. The elements which are common to exactly two sets at a time.
3. The elements which are common to all three sets.


In the figure below, the areas common to exactly two sets are marked “1”, “2” and “3”. The
area common to all three sets is marked “4”.
Union of three sets


The union consists of disjoint
regions.


Union of three sets 


As discussed earlier, the sum of the numbers of elements of the individual sets is greater than the number
of elements in “$A \cup B \cup C$”, unless the sets are disjoint. It is imperative that we account for the
repetition of common elements. Proceeding as in the case of the union of two sets, we deduct the
intersections between each pair of sets, which gives the provisional expression :


$n(A \cup B \cup C) = n(A) + n(B) + n(C) - n(A \cap B) - n(A \cap C) - n(B \cap C)$


In this manner, we account for common elements between two sets. However, we have deducted 
elements "common to all three sets" in this process — three times. On the other hand, the elements 
"common to all three sets" are present in the numbers of each of the individual sets - in total three 
times as there are three sets. Ultimately, we find that we have not counted the elements common to all 
sets at all. It means that we need to account for the elements common to all three sets. In order to add 
this number, we first need to know — what does this common area (marked 4) represent symbolically? 


In the earlier module, we have seen that the area marked “4” is represented by “$A \cap B \cap C$”. Hence,
the correct expansion for the number of elements in the union involving three sets is :


$\Rightarrow n(A \cup B \cup C) = n(A) + n(B) + n(C) - n(A \cap B) - n(A \cap C) - n(B \cap C) + n(A \cap B \cap C)$


Note: This result is an important result as the same is used while studying probability. 


Union of three sets (Analytical method) 


We can achieve this result analytically as well. Here, we consider “A” as one set and “$(B \cup C)$” as
the other set. Then, we apply the relation which has been obtained for the number of elements in the
union of two sets as :


$n(A \cup B \cup C) = n(A) + n(B \cup C) - n[A \cap (B \cup C)]$
Applying the result for the union of two sets to “$n(B \cup C)$”, we have :
$n(B \cup C) = n(B) + n(C) - n(B \cap C)$


Putting this in the expression for “$n(A \cup B \cup C)$”,


$\Rightarrow n(A \cup B \cup C) = n(A) + n(B) + n(C) - n(B \cap C) - n[A \cap (B \cup C)]$


At this stage, our task is to evaluate “$n[A \cap (B \cup C)]$”. Recall that we have worked with the
distributive property of the “intersection operator over the union operator”. Following the distributive
property,


$\Rightarrow n[A \cap (B \cup C)] = n[(A \cap B) \cup (A \cap C)]$


We can treat each of the terms in the small brackets on the right hand side of the above equation as a
set. Applying the relation obtained for the number of elements in the union of two sets again, we have :


$\Rightarrow n[A \cap (B \cup C)] = n(A \cap B) + n(A \cap C) - n[(A \cap B) \cap (A \cap C)]$
The last term in the above equation is :
$(A \cap B) \cap (A \cap C) = (A \cap B \cap C)$
Hence,
$\Rightarrow n[A \cap (B \cup C)] = n(A \cap B) + n(A \cap C) - n(A \cap B \cap C)$


Now, putting this expression in the expression for the number of elements in the union involving three
sets and rearranging terms, we have :


$\Rightarrow n(A \cup B \cup C) = n(A) + n(B) + n(C) - n(A \cap B) - n(A \cap C) - n(B \cap C) + n(A \cap B \cap C)$


In a nutshell, we find that the number of elements in the union is equal to the sum of the numbers of
elements in the individual sets, minus the elements common to two sets taken at a time, plus the elements
common to all three sets.
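
As a quick sanity check, the expansion can be tested on small concrete sets. The three sets below are arbitrary illustrative choices, not taken from the module.

```python
# Arbitrary example sets (assumed for illustration only).
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7}
C = {1, 5, 7, 8, 9}

# Direct count of the union.
direct = len(A | B | C)

# Subtracting only the pairwise intersections under-counts the
# elements common to all three sets.
pairwise_only = (len(A) + len(B) + len(C)
                 - len(A & B) - len(A & C) - len(B & C))

# Full expansion: add back the triple intersection.
expansion = pairwise_only + len(A & B & C)

print(direct, pairwise_only, expansion)
```

The middle value falls short of the direct count by exactly the number of elements common to all three sets, which is why the last term has to be added back.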


Illustration 


In this section, we shall work with an example which is quite illustrative of the analysis involving three
sets. We shall see that analysis of set operations in terms of a Venn diagram is very direct and simple.
As such, we shall first attempt to analyze the situation with a Venn diagram.


However, we need to emphasize that the extension of set concepts to calculus, probability and other
branches of mathematics requires that we develop analytical skill with respect to set operations.
Keeping this aspect in mind, we shall also work out the solution using the analytical method.


Problem : In a town, a total of 100000 people read newspapers. Out of these, 40% read newspaper “A”,
30% read newspaper “B” and 10% read newspaper “C”. It is found that 5% read both “A” and “B”; 4%
read both “A” and “C”; and 3% read both “B” and “C”. Also, 2% of the people read all three
newspapers. Find the number of people (i) who read only “A” (ii) who read only “B” (iii) who read none
of the three newspapers.


We define three sets “A”, “B” and “C”, corresponding to people reading newspapers “A”, “B” and “C”
respectively. From the question, we have :


$n(A) = 0.4 \times 100000 = 40000$
$n(B) = 0.3 \times 100000 = 30000$
$n(C) = 0.1 \times 100000 = 10000$
$n(A \cap B) = 0.05 \times 100000 = 5000$
$n(A \cap C) = 0.04 \times 100000 = 4000$
$n(B \cap C) = 0.03 \times 100000 = 3000$
$n(A \cap B \cap C) = 0.02 \times 100000 = 2000$


Venn's diagram method 


We observe that the sum of the numbers in the individual sets is less than the total number of people
reading newspapers in the town. Hence, the total reading population represents the universal set, “U”.
The representations of these sets are shown on the Venn diagram. Note that we have split the elements
common to a pair of sets in two parts : (a) elements exclusive to the intersection of the two sets and
(b) elements common to all three sets.

Union of three sets


The regions common to sets: 3000 people read only “A” and “B”, 2000 read only “A” and “C”,
1000 read only “B” and “C”, and 2000 read all three.


From the diagram, 


(i) The required set is the region of “A” not common to “B” and “C”. This region represents elements, 
which are exclusive to set “A”. Thus, numbers of people reading only “A” is : 


$n_1 = 40000 - (3000 + 2000 + 2000) = 33000$


(ii) The required set is the region of “B” not common to “A” and “C”. This region represents elements, 
which are exclusive to set “B”. Thus, numbers of people reading only “B” is : 


$n_2 = 30000 - (3000 + 1000 + 2000) = 24000$


(iii) The required set is the remaining region of the universal set “U”, i.e. the complement of the union
of three sets. Now, proceeding as before, the number of people who read newspaper “C” only is (see the
Venn diagram above):


$n_3 = 10000 - (2000 + 1000 + 2000) = 5000$


Hence, the required number is :
Union of three sets


The regions representing people
who read none of the three
newspapers.


$n_4 = 100000 - (33000 + 24000 + 5000 + 3000 + 2000 + 1000 + 2000)$
$\Rightarrow n_4 = 30000$


Analytical method 

(i) Here we are required to find the number of people reading only “A”. It is clear that this set is part
of the set of people who do not read newspapers “B” and “C”. As discussed in the case of two sets, the
set of people who read neither “B” nor “C” is given by De Morgan's first equation,


$B' \cap C' = (B \cup C)'$


Intersection of two complement sets 


The region representing people, 
who read neither of two 
newspapers. 


For clarity, we have shown this region in the Venn diagram. We should realize that this intersection
of two complement sets also includes people who read newspaper “A”. However, we are required to know
the number of people who read newspaper “A” only. People who read another newspaper as well are
not part of our required set.


The remaining set (refer to the Venn diagram) represents the area consisting of people who exclusively
read newspaper “A” or who do not read any of the three newspapers. Now, we need the intersection of set
“A” with the remaining region to obtain the number of people who read only newspaper “A”. Hence, the
required number is :


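$n_1 = n(A \cap B' \cap C')$
Using De Morgan's equation,
$n_1 = n(A \cap B' \cap C') = n[A \cap (B \cup C)']$
$\Rightarrow n_1 = n[A \cap [U - (B \cup C)]] = n[(A \cap U) - A \cap (B \cup C)]$
$\Rightarrow n_1 = n[A - A \cap (B \cup C)] = n(A) - n[A \cap (B \cup C)]$
Using the distributive property of intersection over union,
$\Rightarrow n_1 = n(A) - n[(A \cap B) \cup (A \cap C)]$
Using the formula for the expansion of the union of two sets,
$\Rightarrow n_1 = n(A) - [n(A \cap B) + n(A \cap C) - n[(A \cap B) \cap (A \cap C)]]$
$\Rightarrow n_1 = n(A) - [n(A \cap B) + n(A \cap C) - n(A \cap B \cap C)]$
We see that the values of each term on the right hand side are given. Putting these values,


$\Rightarrow n_1 = 40000 - [5000 + 4000 - 2000] = 33000$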


(ii) Here we are required to find the number of people reading only “B”. Proceeding as before,
$n_2 = n(B) - [n(B \cap A) + n(B \cap C) - n(B \cap A \cap C)]$
$\Rightarrow n_2 = n(B) - [n(A \cap B) + n(B \cap C) - n(A \cap B \cap C)]$
Putting values,
$\Rightarrow n_2 = 30000 - [5000 + 3000 - 2000] = 24000$


(iii) In order to find the number of people who read none of the three newspapers, we first find the
union of the three sets. The union represents people who read at least one of these newspapers: one, two
or all three. Clearly, people who do not read any of these papers constitute the complement of this union.


$n_3 = n[(A \cup B \cup C)'] = n(U) - n(A \cup B \cup C)$
Using the expansion for the number of elements in the union of three sets,


$\Rightarrow n_3 = n(U) - [n(A) + n(B) + n(C) - n(A \cap B) - n(A \cap C) - n(B \cap C) + n(A \cap B \cap C)]$


Putting values, we have :


$\Rightarrow n_3 = 100000 - [40000 + 30000 + 10000 - 5000 - 4000 - 3000 + 2000] = 30000$
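
The analytical results can be reproduced with a few lines of Python. This is only a sketch that plugs the given percentages into the formulas derived above; the variable names are illustrative.

```python
total = 100_000

# Set sizes computed from the percentages given in the problem.
n_a, n_b, n_c = 40_000, 30_000, 10_000
n_ab, n_ac, n_bc = 5_000, 4_000, 3_000
n_abc = 2_000

# (i) only "A": n(A) - [n(A∩B) + n(A∩C) - n(A∩B∩C)]
only_a = n_a - (n_ab + n_ac - n_abc)

# (ii) only "B": n(B) - [n(A∩B) + n(B∩C) - n(A∩B∩C)]
only_b = n_b - (n_ab + n_bc - n_abc)

# (iii) none: n(U) - n(A∪B∪C), using the three-set expansion
union = n_a + n_b + n_c - n_ab - n_ac - n_bc + n_abc
none = total - union

print(only_a, only_b, none)
```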


Cartesian product 


We have seen that set operations convey the notion of arithmetic operations. One such
operation is the product of two sets, called the “Cartesian product”. Since a set is a collection and
not a single quantity, the product operation here involves pairing each of the elements of
one set with each of the elements of another set.


We use the symbol “×” to denote the product operation. The Cartesian product of two sets “A” and “B”
is symbolically represented as :


$A \times B$


It is important to understand that we do not multiply elements as we do in arithmetic — instead 
we pair elements together. This is the meaning of “product” for the sets. We denote one such 
pair within a pair of small brackets like : 


$(a, b)$
where $a \in A$ and $b \in B$.


Note that elements from two sets are separated by comma. 


Ordered pair 


The order of pairing is important. The pairs (a,b) and (b,a) are different. This ordering is required
as there are real-life situations where order makes a difference. Consider, for example, that we are
required to find the two-digit integers which can be formed from two sets of digits like {1,2,3} and
{3,4,5}. Clearly, “13” and “31” represent different integers. We need to distinguish them. All
pairs formed from two sets should be distinct.


Keeping this restriction in mind, let us work out an example to find ordered pairs formed from 
elements of two sets. 


A = set of first letters of the names of cities = {N, D, H}
B = set of numbers denoting flight numbers = {001, 002, 003}
All possible ordered pairs formed from the two sets are :
(N, 001), (N, 002), (N, 003), (D, 001), (D, 002), (D, 003), (H, 001), (H, 002), (H, 003)


There are altogether 9 ordered pairs. From this example, we can deduce a method for writing
ordered pairs from two sets. We begin with the first elements of the two sets. Progressively, we
change the element from the second set till the set is exhausted, while keeping the element from the
first set unchanged. Then, we switch to the “next” element from the first set and start with the “first”
from the second set. Again, we change the element from the second set progressively till the set is
exhausted, while keeping the element from the first set unchanged. We continue in this manner
till all elements from the first set are also exhausted.
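
The enumeration method just described is what Python's standard `itertools.product` does: it holds the element from the first sequence fixed while cycling through the second. A minimal sketch, with lists used in place of sets so that the listing order is visible:

```python
from itertools import product

A = ["N", "D", "H"]          # first letters of city names
B = ["001", "002", "003"]    # flight numbers

# product() varies the last argument fastest, as in the manual method.
pairs = list(product(A, B))

for pair in pairs:
    print(pair)
print(len(pairs), "ordered pairs")
```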


From this discussion, it is also evident that two ordered pairs are equal if and only if the 
corresponding first and second elements are equal. 


Cartesian product 
The Cartesian product of two sets is defined in terms of ordered pairs. 


Cartesian product 
The Cartesian product of two non-empty sets “A” and “B” is the set of all ordered pairs of 
the elements from two sets. 


We should emphasize the use of the word “non-empty”. The Cartesian product of a non-empty set
with an empty set is equal to the empty set.


$A \times \phi = \phi$
On the other hand, if one of the sets is infinite and the other is non-empty, then the resulting
Cartesian product is also infinite. We express the Cartesian product set in set-builder form as :
$A \times B = \{(x, y) : x \in A, y \in B\}$


Here, the use of "," in the set-builder form is equivalent to "and". Therefore, we can also write the
Cartesian product of two sets as :


$A \times B = \{(x, y) : x \in A \text{ and } y \in B\}$


Further, we can emphasize the two-way validity of the conditional statement, as in the case of other
set operators :


$(x, y) \in A \times B \Leftrightarrow x \in A \text{ and } y \in B$


Graphical representation 


The ordered pairs can be represented in the form of tabular cells or points of intersection of 
perpendicular lines. The elements of one set are represented as rows, whereas elements of other 
set are represented as columns. Look at the representation of ordered pairs by points in the 
figure for the example given earlier. 

Cartesian product 


The elements of one set 
are represented as rows, 
whereas elements of 
other set are represented 
as columns. 


Note that there are a total of 9 intersection points, corresponding to 9 ordered pairs. 


Examples 


Problem 1 : If $(x^2 - 1, y + 2) = (0, 2)$, find “x” and “y”.


Solution : Two ordered pairs are equal. It means that corresponding elements of the ordered
pairs are equal. Hence,


$\Rightarrow x^2 - 1 = 0$
$\Rightarrow x = 1 \text{ or } -1$
and
$\Rightarrow y + 2 = 2$
$\Rightarrow y = 0$


Problem 2 : If A = {5,6,7,2}, B = {3,5,6,1} and C = {4,1,8}, then find $(A \cap B) \times (B \cap C)$.


Solution : In order to evaluate the given expression, we first find out the intersections given in
the brackets.


$\Rightarrow A \cap B = \{5, 6\}$
$\Rightarrow B \cap C = \{1\}$
Thus,
$(A \cap B) \times (B \cap C) = \{(5, 1), (6, 1)\}$


Note that the elements in the given sets are not listed in any particular order. They are purposely given
this way to emphasize that order is a requirement of an ordered pair, not of a set.


Numbers of elements 


We have seen that ordered pairs are represented graphically by the points of intersection. The
number of intersections is equal to the product of the numbers of rows and columns. Thus, if there are
“p” elements in the set “A” and “q” elements in the set “B”, then the total number of ordered pairs
is “pq”. In symbolic notation,


$n(A \times B) = pq$


Multiple products 


Like other set operations, the product operation can also be applied to a series of sets in
sequence. If $A_1, A_2, \ldots, A_n$ is a finite family of sets, then their Cartesian product, one after
another, is symbolically represented as :


$A_1 \times A_2 \times \cdots \times A_n$


This product is a set of groups of ordered elements. Each group of ordered elements comprises
“n” elements. This is stated as :


$A_1 \times A_2 \times \cdots \times A_n = \{(x_1, x_2, \ldots, x_n) : x_1 \in A_1, x_2 \in A_2, \ldots, x_n \in A_n\}$


Ordered triplets
The Cartesian product $A \times A \times A$ is a set of triplets. This product is defined as :
$A \times A \times A = \{(x, y, z) : x, y, z \in A\}$


We can also represent the Cartesian product of a given set with itself in terms of a Cartesian power. In
general,


$A^n = A \times A \times \cdots \times A$
where “n” is the Cartesian power. If n = 2, then
$A^2 = A \times A$


This Cartesian product is also called the Cartesian square.


Example 
Problem 3 : If A = {-1,1}, then find the Cartesian cube of set A.


Solution : Following the method of writing ordered sequences of numbers, the product can be
written as :


$A \times A \times A = \{(-1,-1,-1), (-1,-1,1), (-1,1,-1), (-1,1,1),$
$(1,-1,-1), (1,-1,1), (1,1,-1), (1,1,1)\}$


The total number of elements is 2 × 2 × 2 = 8.
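
The same Cartesian cube can be generated programmatically. The sketch below uses the `repeat` argument of the standard `itertools.product` to form the product of a collection with itself three times.

```python
from itertools import product

A = [-1, 1]

# Cartesian cube: A x A x A, as ordered triplets.
cube = list(product(A, repeat=3))

for triplet in cube:
    print(triplet)
print(len(cube), "triplets")
```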


Cartesian Coordinate system 


The Cartesian product consisting of ordered triplets of real numbers represents the Cartesian
three-dimensional space.


$R \times R \times R = \{(x, y, z) : x, y, z \in R\}$


Each of the elements in the ordered triplet is a coordinate along an axis and each ordered triplet 
denotes a point in three dimensional coordinate space. 
Cartesian coordinate system


The coordinate of a point is an
ordered triplet.


Similarly, the Cartesian product "$R \times R$", consisting of ordered pairs, defines a Cartesian plane,
or Cartesian coordinates in two dimensions. It is for this reason that we call the three-dimensional
rectangular coordinate system a Cartesian coordinate system.


Commutative property of Cartesian product 


The Cartesian product is a set of ordered pairs. Now, the order of elements in the ordered pair
depends on the position of the sets across the product sign. If sets "A" and "B" are unequal and
non-empty sets, then :


$A \times B \neq B \times A$


In general, any operation involving Cartesian product that changes the "order" in the "ordered 
pair" will yield different result. 


However, if "A" and "B" are non-empty, but equal sets, then the significance of the order in the
"ordered pair" is lost. We can use this fact to formulate a law to verify "equality of sets". Hence,
if sets "A" and "B" are two non-empty sets and


$A \times B = B \times A$
Then,
$A = B$


It can also be verified that this condition is true the other way also. If sets "A" and "B" are equal
sets, then $A \times B = B \times A$. The two-way conditional statement can be symbolically
represented with the help of a two-way arrow,


$A \times B = B \times A \Leftrightarrow A = B$
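
A small Python check illustrates both directions of this statement. The helper `cartesian` and the example sets are illustrative assumptions, not part of the module.

```python
from itertools import product

def cartesian(X, Y):
    """Return the Cartesian product X x Y as a set of ordered pairs."""
    return set(product(X, Y))

A = {1, 2}
B = {3, 4}
C = {2, 1}   # same set as A, written in a different order

# Unequal, non-empty sets: the two products differ.
unequal_differ = cartesian(A, B) != cartesian(B, A)

# Equal sets: swapping the factors changes nothing.
equal_agree = cartesian(A, C) == cartesian(C, A)

print(unequal_differ, equal_agree)
```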


Distributive property of product operator 


The distributive property of the product operator holds over other set operators like the union,
intersection and difference operators. We write the equations involving distribution of the product
operator over each of the other operators as :


$A \times (B \cup C) = (A \times B) \cup (A \times C)$
$A \times (B \cap C) = (A \times B) \cap (A \times C)$
$A \times (B - C) = (A \times B) - (A \times C)$


Here, sets “A”, “B” and “C” are non-empty sets. In order to ascertain the distributive property of the
product operator over other set operators, we need to check the validity of the equations given above.


We can check these relations by proceeding from the defining statements. For the time being, we
reason that the sequence of operations on either side of the equation does not affect the “order” in the
“ordered pair”. Hence, the distributive property should hold for the product operator over the three
named operators. Let us check this with an example :


A = {a,b}, B = {1,2} and C = {2,3}
1: For distribution over the union operator
$\Rightarrow LHS = A \times (B \cup C) = \{a, b\} \times \{1, 2, 3\}$
$\Rightarrow LHS = \{(a,1), (a,2), (a,3), (b,1), (b,2), (b,3)\}$
Similarly,


$\Rightarrow RHS = (A \times B) \cup (A \times C) = \{(a,1), (a,2), (b,1), (b,2)\} \cup \{(a,2), (a,3), (b,2), (b,3)\}$


$\Rightarrow RHS = \{(a,1), (a,2), (a,3), (b,1), (b,2), (b,3)\}$
Hence,
$\Rightarrow A \times (B \cup C) = (A \times B) \cup (A \times C)$
2: For distribution over the intersection operator
$\Rightarrow LHS = A \times (B \cap C) = \{a, b\} \times \{2\}$
$\Rightarrow LHS = \{(a,2), (b,2)\}$
Similarly,
$\Rightarrow RHS = (A \times B) \cap (A \times C) = \{(a,1), (a,2), (b,1), (b,2)\} \cap \{(a,2), (a,3), (b,2), (b,3)\}$
$\Rightarrow RHS = \{(a,2), (b,2)\}$
Hence,
$\Rightarrow A \times (B \cap C) = (A \times B) \cap (A \times C)$
3: For distribution over the difference operator
$\Rightarrow LHS = A \times (B - C) = \{a, b\} \times \{1\}$
$\Rightarrow LHS = \{(a,1), (b,1)\}$
Similarly,
$\Rightarrow RHS = (A \times B) - (A \times C) = \{(a,1), (a,2), (b,1), (b,2)\} - \{(a,2), (a,3), (b,2), (b,3)\}$
$\Rightarrow RHS = \{(a,1), (b,1)\}$
Hence,


$\Rightarrow A \times (B - C) = (A \times B) - (A \times C)$
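
The three checks above can be repeated mechanically. The following sketch uses the same sets as the example; the helper name `cartesian` is an illustrative choice.

```python
from itertools import product

def cartesian(X, Y):
    """Return the Cartesian product X x Y as a set of ordered pairs."""
    return set(product(X, Y))

A = {"a", "b"}
B = {1, 2}
C = {2, 3}

# Distribution of the product over union, intersection and difference.
over_union = cartesian(A, B | C) == cartesian(A, B) | cartesian(A, C)
over_intersection = cartesian(A, B & C) == cartesian(A, B) & cartesian(A, C)
over_difference = cartesian(A, B - C) == cartesian(A, B) - cartesian(A, C)

print(over_union, over_intersection, over_difference)
```

A check on one example is not a proof; it only confirms that the example is consistent with the three identities.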


Analytical proof 


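Let us consider an arbitrary ordered pair (x,y), which belongs to the Cartesian product set
“$A \times (B \cup C)$”. Then,


$\Rightarrow (x, y) \in A \times (B \cup C)$
By the definition of the product of two sets,


$\Rightarrow x \in A \text{ and } y \in (B \cup C)$


By the definition of the union of two sets,
$\Rightarrow x \in A \text{ and } (y \in B \text{ or } y \in C)$
$\Rightarrow (x \in A \text{ and } y \in B) \text{ or } (x \in A \text{ and } y \in C)$
$\Rightarrow (x, y) \in A \times B \text{ or } (x, y) \in A \times C$
By the definition of the union of two sets,
$\Rightarrow (x, y) \in (A \times B) \cup (A \times C)$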


But we had started with "$A \times (B \cup C)$" and used definitions to show that the ordered pair “(x,y)”
belongs to another set. It means that the other set contains, at the least, all the elements of the first
set. Thus,


$\Rightarrow A \times (B \cup C) \subset (A \times B) \cup (A \times C)$

Similarly, we can start with "$(A \times B) \cup (A \times C)$" and reach the conclusion that :
$\Rightarrow (A \times B) \cup (A \times C) \subset A \times (B \cup C)$

If sets are subsets of each other, then they are equal. Hence,
$\Rightarrow A \times (B \cup C) = (A \times B) \cup (A \times C)$


Proceeding in the same manner, we can also prove distribution of the product operator over the
intersection and difference operators,


$A \times (B \cap C) = (A \times B) \cap (A \times C)$
$A \times (B - C) = (A \times B) - (A \times C)$


Cartesian Product (exercise) 


Note : The results of some of the questions (3 - 7) are of generic nature. As such, they can 
also be treated as theorems on Cartesian products. 


Worked out exercises 


Problem 1 : The Cartesian product "$A \times B$" consists of 6 elements. If three of these are (1,2),
(2,3) and (3,3), then find the Cartesian product set "$B \times A$".


Solution : We need to know the two sets “A” and “B” in order to evaluate "$B \times A$". The first
elements of the ordered pairs of "$A \times B$" are elements of set “A”. Hence, “1”, “2” and “3” are
elements of set “A”. On the other hand, the second elements of the ordered pairs of "$A \times B$" are
elements of set “B”. Hence, “2” and “3” are elements of set “B”.


Now, it is given that there are a total of 6 elements in the Cartesian product, which is equal to
the product of the numbers of elements in the two sets, i.e. 3 × 2. It means that we have identified
all elements of sets “A” and “B”.


$A = \{1, 2, 3\}$
$B = \{2, 3\}$
Following the rule for writing a Cartesian product in terms of ordered pairs, we have :


$\Rightarrow B \times A = \{(2,1), (2,2), (2,3), (3,1), (3,2), (3,3)\}$


Problem 2 : Two sets are given as : A = {1,2} and B = {3,4}. Find the total number of
subsets of "$A \times B$". Also write the power set of "$A \times B$" in roster form.


Solution : The total number of elements in the Cartesian product "$A \times B$" is “pq”, where
“p” and “q” are the numbers of elements in the individual sets “A” and “B” respectively.
The number of all possible subsets that can be formed, including the empty set and the product
"$A \times B$" itself, is :


$n = 2^{pq} = 2^{2 \times 2} = 2^4 = 16$


Now, the Cartesian product is :
$\Rightarrow A \times B = \{(1,3), (1,4), (2,3), (2,4)\}$


The corresponding power set comprises the empty set, 4 subsets of one element each, 6 subsets of
two elements each, 4 subsets of three elements each, and the set itself. There is a total of 16 subsets.
The power set is the set having all these subsets as its elements :


$\Rightarrow P(A \times B) = \{\phi, \{(1,3)\}, \{(1,4)\}, \{(2,3)\}, \{(2,4)\},$
$\{(1,3), (1,4)\}, \{(1,3), (2,3)\}, \{(1,3), (2,4)\}, \{(1,4), (2,3)\}, \{(1,4), (2,4)\}, \{(2,3), (2,4)\},$
$\{(1,3), (1,4), (2,3)\}, \{(1,3), (1,4), (2,4)\}, \{(1,3), (2,3), (2,4)\}, \{(1,4), (2,3), (2,4)\},$
$\{(1,3), (1,4), (2,3), (2,4)\}\}$


It is easy to follow a scheme to write combinations in which order is not relevant. We can
denote each of the ordered pairs with a symbol like :


$A \times B = \{a, b, c, d\}$


As pointed out for generating ordered pairs, we can start with the left
element and keep changing the last element of the combination till all combinations are
exhausted. Here, the power set in terms of symbols is :


$P(A \times B) = \{\phi, \{a\}, \{b\}, \{c\}, \{d\},$
$\{a, b\}, \{a, c\}, \{a, d\}, \{b, c\}, \{b, d\}, \{c, d\},$
$\{a, b, c\}, \{a, b, d\}, \{a, c, d\}, \{b, c, d\},$
$\{a, b, c, d\}\}$
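
The power set can also be generated by taking combinations of every size, from 0 up to the number of elements. The sketch below uses the standard `itertools` functions `product`, `combinations` and `chain`; tuples stand in for subsets.

```python
from itertools import chain, combinations, product

A = [1, 2]
B = [3, 4]

pairs = list(product(A, B))   # the Cartesian product A x B

# All subsets: combinations of size 0, 1, ..., len(pairs).
power_set = list(chain.from_iterable(
    combinations(pairs, size) for size in range(len(pairs) + 1)
))

# Number of subsets of each size.
sizes = [sum(1 for s in power_set if len(s) == k)
         for k in range(len(pairs) + 1)]

print(len(power_set), sizes)
```

The subsets come out in the same order as the roster listing above: by size first, then by changing the last element fastest.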


Problem 3 : If "$A \subset B$" and "C" is any non-empty set, then prove that :
$A \times C \subset B \times C$


Solution : Let us first discuss the logic of the relation here. The elements of set “A” are
common to set “B”. Now, the Cartesian product of set "A" with set “C” will yield ordered pairs
which are common with the ordered pairs of the Cartesian product of "B" with "C". However,
as set “B” is either larger than or equal to, but not smaller than, “A”, it follows that the above
relation should hold.


Now, we prove the relation analytically. Let an arbitrary ordered pair (x,y) belong to the
Cartesian product "$A \times C$".


$(x, y) \in A \times C$
According to the definition of the Cartesian product,
$\Rightarrow x \in A \text{ and } y \in C$


But “A” is a subset of “B”. Hence, $x \in B$.


$\Rightarrow x \in B \text{ and } y \in C$
Again, applying the definition of the Cartesian product,
$\Rightarrow (x, y) \in B \times C$
This means that :


$\Rightarrow A \times C \subset B \times C$


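Problem 4 : If $A \subset B$ and $C \subset D$, then prove that :
$A \times C \subset B \times D$
Solution : Let an arbitrary ordered pair (x,y) belong to the Cartesian product "$A \times C$".
$(x, y) \in A \times C$
According to the definition of the Cartesian product,
$\Rightarrow x \in A \text{ and } y \in C$
But “A” is a subset of “B”. Hence, $x \in B$. Also, “C” is a subset of “D”. Hence, $y \in D$.
$\Rightarrow x \in B \text{ and } y \in D$
Again, applying the definition of the Cartesian product,
$\Rightarrow (x, y) \in B \times D$
This means that :


$\Rightarrow A \times C \subset B \times D$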


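Problem 5 : For any given four sets “A”, “B”, “C” and “D”, prove that :
$(A \times B) \cap (C \times D) = (A \cap C) \times (B \cap D)$


Solution : Let an arbitrary ordered pair (x,y) belong to the intersection set
“$(A \times B) \cap (C \times D)$”. Then,


$(x, y) \in (A \times B) \cap (C \times D)$


Applying the definition of intersection,


$\Rightarrow (x, y) \in A \times B \text{ and } (x, y) \in C \times D$
Applying the definition of the Cartesian product,
$\Rightarrow (x \in A \text{ and } y \in B) \text{ and } (x \in C \text{ and } y \in D)$
$\Rightarrow (x \in A \text{ and } x \in C) \text{ and } (y \in B \text{ and } y \in D)$
Applying the definition of intersection,
$\Rightarrow (x \in A \cap C) \text{ and } (y \in B \cap D)$
Again, applying the definition of the Cartesian product,
$\Rightarrow (x, y) \in (A \cap C) \times (B \cap D)$
This means that :
$\Rightarrow (A \times B) \cap (C \times D) \subset (A \cap C) \times (B \cap D)$
Similarly, starting from the RHS, we can prove that :
$\Rightarrow (A \cap C) \times (B \cap D) \subset (A \times B) \cap (C \times D)$
If sets are subsets of each other, then they are equal. Hence,


$\Rightarrow (A \times B) \cap (C \times D) = (A \cap C) \times (B \cap D)$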


Problem 6 : Let “A” and “B” be two non-empty sets. If the number of common elements
between sets “A” and “B” is “n”, then find the number of common elements between the Cartesian
products “$A \times B$” and “$B \times A$”.


Solution : The number of common elements between sets “A” and “B” is “n”. This means :
$n(A \cap B) = n$
We are required to evaluate the expression,
$n[(A \times B) \cap (B \times A)]$
We have seen earlier that, for four given sets,
$(A \times B) \cap (C \times D) = (A \cap C) \times (B \cap D)$


If we put C = B and D = A in this equation, then the expression on the left hand side of the
equation becomes what is required.


$\Rightarrow (A \times B) \cap (B \times A) = (A \cap B) \times (B \cap A)$
$\Rightarrow n[(A \times B) \cap (B \times A)] = n(A \cap B) \times n(B \cap A)$


$\Rightarrow n[(A \times B) \cap (B \times A)] = n \times n = n^2$
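
The result can be checked on a concrete case. The two sets below are arbitrary illustrative choices with two elements in common, so the expected count is 2 × 2 = 4.

```python
from itertools import product

A = {1, 2, 3, 4}
B = {3, 4, 5}

n = len(A & B)   # number of elements common to A and B

# Ordered pairs common to A x B and B x A.
common_pairs = set(product(A, B)) & set(product(B, A))

print(n, len(common_pairs), sorted(common_pairs))
```

The common pairs are exactly the pairs whose two components both lie in the intersection of “A” and “B”.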


Problem 7 : Let “A”, “B” and “C” be three sets. Then prove that :
$A \times (B' \cup C')' = (A \times B) \cap (A \times C)$
Solution : From De Morgan's first theorem, we know that :
$(B \cup C)' = B' \cap C'$

Applying this to the LHS, we have :

$\Rightarrow A \times (B' \cup C')' = A \times [(B')' \cap (C')']$
Now, the complement of the complement of a set is the set itself. Hence,

$\Rightarrow A \times (B' \cup C')' = A \times (B \cap C)$

Applying the distributive property of the product operator over the intersection operator, we have :


$\Rightarrow A \times (B' \cup C')' = (A \times B) \cap (A \times C)$


Problem 8 : Let “A”, “B” and “C” be three sets. Then prove that :
$A \times (B' \cap C')' = (A \times B) \cup (A \times C)$
Solution : From De Morgan's second theorem, we know that :
$(B \cap C)' = B' \cup C'$

Applying this to the LHS, we have :

$\Rightarrow A \times (B' \cap C')' = A \times [(B')' \cup (C')']$
Now, the complement of the complement of a set is the set itself. Hence,

$\Rightarrow A \times (B' \cap C')' = A \times (B \cup C)$


Applying the distributive property of the product operator over the union operator, we have :


$\Rightarrow A \times (B' \cap C')' = (A \times B) \cup (A \times C)$


Probability Concepts -- Probability 
An introduction to probability and the multiplication rule. 


If you flip a coin, what is the chance of getting heads? That’s easy: 50/50.
In the language of probability, we say that the probability is 1/2. That is to
say, half the time you flip coins, you will get heads.


So here is a harder question: if you flip two coins, what is the chance that
you will get heads both times? I asked this question of my son, who has
good mathematical intuition but no training in probability. His immediate
answer: 1/3. There are three possibilities: two heads, one heads and one tails,
and two tails. So there is a 1/3 chance of getting each possibility, including
two heads. Makes sense, right?


But it is not right. If you try this experiment 100 times, you will not find 
about 33 “both heads” results, 33 “both tails,” and 33 “one heads and one 
tails.” Instead, you will find something much closer to: 25 “both heads,” 25 
“both tails,” and 50 “one of each” results. Why? 


Because hidden inside this experiment are actually four different results, 
each as likely as the others. These results are: heads-heads, heads-tails, 
tails-heads, and tails-tails. Even if you don’t keep track of what “order” the 
coins flipped in, heads-tails is still a different result from tails-heads, and 
each must be counted. 


And what if you flip a coin three times? In this case, there are actually eight
results. In case this is getting hard to keep track of, here is a systematic way
of listing all eight results.


First Coin    Second Coin    Third Coin    End Result

Heads         Heads          Heads         HHH
Heads         Heads          Tails         HHT
Heads         Tails          Heads         HTH
Heads         Tails          Tails         HTT
Tails         Heads          Heads         THH
Tails         Heads          Tails         THT
Tails         Tails          Heads         TTH
Tails         Tails          Tails         TTT


When you make a table like this, the pattern becomes apparent: each new
coin doubles the number of possibilities. The chance of three heads in a row
is 1/8. What would be the chance of four heads in a row?
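
The systematic listing can be reproduced in Python by enumerating every sequence of flips. This is a minimal sketch using the standard `itertools.product` and `fractions.Fraction`; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Every equally likely result of flipping a coin three times.
three_flips = ["".join(flips) for flips in product("HT", repeat=3)]
p_three_heads = Fraction(three_flips.count("HHH"), len(three_flips))

# The two-coin case: four results, not three.
two_flips = ["".join(flips) for flips in product("HT", repeat=2)]
p_two_heads = Fraction(two_flips.count("HH"), len(two_flips))

print(three_flips)
print(p_three_heads, p_two_heads)
```

Changing `repeat` lets you explore longer runs of flips in the same way.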


Let’s take a slightly more complicated, and more interesting, example.
You are the proud inventor of the SongWriter 2000™.


The user sets the song speed (“fast,” “medium,” or “slow”); the volume 
(“loud” or “quiet”); and the style (“rock” or “country”). Then, the 
SongWriter automatically writes a song to match. 


How many possible settings are there? You might suspect that the answer is
3 + 2 + 2 = 7, but in fact there are many more than that. We can see them
all on the following “tree diagram.”


speed     fast                            medium                          slow
volume    loud            quiet           loud            quiet           loud            quiet
style     rock  country   rock  country   rock  country   rock  country   rock  country   rock  country


If you start at the top of a tree like this and follow all the way down, you 
end up with one particular kind of song: for instance, “fast loud country 
song.” There are 12 different song types in all. This comes from 
multiplying the number of settings for each knob: 3 x 2 x 2 = 12. 


Now, suppose the machine has a “Randomize” setting that randomly 
chooses the speed, volume, and style. What is the probability that you will 
end up with a loud rock song that is not slow? To answer a question like 
this, you can use the following process. 


1. Count the total number of results (the “leaves” in the tree) that match 
your criterion. In this case there are 2: the “fast-loud-rock” and 
“medium-loud-rock” paths. 

2. Count the total number of results: as we said previously, there are 12. 

3. Divide. The probability of a non-slow loud rock song is 2/12, or 1/6. 
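
The three steps above can be sketched in Python by building every leaf of the tree and counting the ones that match. The setting names come from the text; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

speeds = ["fast", "medium", "slow"]
volumes = ["loud", "quiet"]
styles = ["rock", "country"]

# Every leaf of the tree diagram: one (speed, volume, style) per song type.
songs = list(product(speeds, volumes, styles))

# Step 1: results that match the criterion (loud rock song, not slow).
matches = [s for s in songs
           if s[0] != "slow" and s[1] == "loud" and s[2] == "rock"]

# Steps 2 and 3: divide by the total number of results.
probability = Fraction(len(matches), len(songs))

print(len(songs), matches, probability)
```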


Note that this process will always give you a number between 0 (no results 
match) and 1 (all results match). Probabilities are always between 0 (for 
something that never happens) and 1 (for something that is guaranteed to 
happen). 


But what does it really mean to say that “the probability is 1/6?” You aren’t 
going to get 1/6 of a song. One way to make this result more concrete is to 
imagine that you run the machine on its “Randomize” setting 100 times. 
You should expect to get non-slow loud rock songs 1 out of every 6 times; 
roughly 17 songs will match that description. This gives us another way to 
express the answer: there is a 17% probability of any given song matching 
this description. 


The multiplication rule 
We can look at the above problem another way. 


What is the chance that any given, randomly selected song will be non-
slow? 2/3. That is to say, 2 out of every three randomly chosen songs will be
non-slow.


Now, out of those 2/3, how many will be loud? Half of them. The
probability that a randomly selected song is both non-slow and loud is half
of 2/3, or 1/2 × 2/3, or 1/3.


And now, out of that 1/3, how many will be rock? Again, half of them:
1/2 × 1/3 = 1/6. This leads us back to the conclusion we came to earlier: 1/6 of


randomly chosen songs will be non-slow, loud, rock songs. But it also gives 
us an example of a very general principle that is at the heart of all 
probability calculations: 


When two events are independent, the probability that they will both 
occur is the probability of one, multiplied by the probability of the 
other. 


What does it mean to describe two events as “independent?” It means that 
they have no effect on each other. In real life, we know that rock songs are 
more likely to be fast and loud than slow and quiet. Our machine, however, 
keeps all three categories independent: choosing “Rock” does not make a 
song more likely to be fast or slow, loud or quiet. 


In some cases, applying the multiplication rule is very straightforward.
Suppose you generate two different songs: what is the chance that they will
both be fast songs? The two songs are independent of each other, so the
chance is 1/3 × 1/3 = 1/9.

Now, suppose you generate five different songs. What is the chance that
they will all be fast? 1/3 × 1/3 × 1/3 × 1/3 × 1/3, or (1/3)^5, or 1 chance in 243.
Not very likely, as you might suspect!


Other cases are less obvious. Suppose you generate five different songs. 
What is the probability that none of them will be a fast song? The 
multiplication rule tells us only how to find the probability of “this and 
that”; how can we apply it to this question? 


The key is to reword the question, as follows. What is the chance that the
first song will not be fast, and the second song will not be fast, and the
third song will not be fast, and so on? Expressed in this way, the question is
a perfect candidate for the multiplication rule. The probability of the first
song being non-fast is 2/3. Same for the second, and so on. So the
probability is (2/3)^5 or 32/243, or roughly 13%.


Based on this, we can easily answer another question: if you generate five 
different songs, what is the probability that at least one of them will be 
fast? Once again, the multiplication rule does not apply directly here: it tells 
us “this and that,” not “this or that.” But we can recognize that this is the 
opposite of the previous question. We said that 13% of the time, none of the 
songs will be fast. That means that the other 87% of the time, at least one 
of them will! 
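The three five-song calculations above can be checked with exact fractions. This is a minimal sketch, assuming each song is fast with probability 1/3 independently of the others; the variable names are illustrative only.

```python
from fractions import Fraction

# Assumption: each song is fast with probability 1/3, independently of the others.
p_fast = Fraction(1, 3)
n_songs = 5

# "All five are fast": multiply the five independent probabilities.
p_all_fast = p_fast ** n_songs

# "None is fast": each song is non-fast with probability 1 - 1/3 = 2/3.
p_none_fast = (1 - p_fast) ** n_songs

# "At least one is fast" is the opposite (complement) of "none is fast".
p_at_least_one_fast = 1 - p_none_fast

print(p_all_fast)                  # 1/243
print(p_none_fast)                 # 32/243
print(p_at_least_one_fast)         # 211/243
print(float(p_at_least_one_fast))  # roughly 0.87
```

Working with `Fraction` rather than floats keeps the answers in the same form as the text (32/243) and avoids rounding until the final step.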


Probability Concepts -- Permutations 
An introduction to permutations.


In the game of “Solitaire” (also known as “Patience” or 
“Klondike”), seven cards are dealt out at the beginning, as 
shown to the left: one face-up, and the other six face-down. (A 
bunch of other cards are dealt out too, but let’s ignore that right
now.) A complete card deck has 52 cards. Assuming that all you
know is the 7 of spades showing, how many possible “hands” 
(the other six cards) could be showing underneath? What makes 
this a “permutations” problem is that order matters: if an ace is 
hiding somewhere in those six cards, it makes a big difference if 
the ace is on the first position, the second, etc. Permutations 
problems can always be addressed as an example of the 
multiplication rule, with one small twist. 


• Question: How many cards might be in the first position, directly under
the showing 7?

• Answer: 51. That card could be anything except the 7 of spades.

• Question: For any given card in first position, how many cards might
be in second position?

• Answer: 50. The seven of spades, and the next card, are both “spoken
for.” So there are 50 possibilities left in this position.

• Question: So how many possibilities are there for the first two positions
combined?

• Answer: 51 × 50, or 2,550.

• Question: So how many possibilities are there for all six positions?

• Answer: 51 × 50 × 49 × 48 × 47 × 46, or approximately $1.3 \times 10^{10}$;
about 13 billion possibilities!


This result can be expressed (and typed into a calculator) more concisely by 
using factorials. 


A “factorial” (written with an exclamation mark) means “multiply all the
numbers from 1 up to this number.” So 5! means 1 × 2 × 3 × 4 × 5 = 120.


What is $\frac{7!}{5!}$? Well, it is $\frac{1 \times 2 \times 3 \times 4 \times 5 \times 6 \times 7}{1 \times 2 \times 3 \times 4 \times 5}$, of course. Most of the terms cancel,
leaving only 6 × 7 = 42.


And what about $\frac{51!}{45!}$? If you write out all the terms, you can see that the first
45 terms cancel, leaving 46 × 47 × 48 × 49 × 50 × 51, which is the
number of permutations we want. So instead of typing into your calculator 
six numbers to multiply (or sixty numbers or six hundred, depending on the 
problem), you can always find the answer to a permutation problem by 
dividing two factorials. In many calculators, the factorial option is located 
under the “probability” menu for this reason. 
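As a quick check of the factorial shortcut, the sketch below computes the Solitaire count three ways using Python's standard `math` module. The variable names are illustrative, and `math.prod` and `math.perm` assume Python 3.8 or later.

```python
import math

# Direct product for the six face-down positions: 46 * 47 * 48 * 49 * 50 * 51.
direct_product = math.prod(range(46, 52))

# The same count written as a quotient of two factorials, 51! / 45!.
factorial_quotient = math.factorial(51) // math.factorial(45)

# math.perm(n, k) computes n! / (n - k)! directly.
builtin_perm = math.perm(51, 6)

print(direct_product)      # 12966811200
print(factorial_quotient)  # 12966811200
print(builtin_perm)        # 12966811200
```

All three agree, which is the point of the cancellation argument: dividing 51! by 45! removes the first 45 factors and leaves only the six that matter.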


Probability Concepts -- Combinations 
An introduction to combinations. 


Let’s start once again with a deck of 52 cards. But this time, let’s deal out a 
poker hand (5 cards). How many possible poker hands are there? 


At first glance, this seems like a minor variation on the Solitaire question 
above. The only real difference is that there are five cards instead of six. 
But in fact, there is a more important difference: order does not matter.
We do not want to count “Ace-King-Queen-Jack-Ten of spades” and “Ten- 
Jack-Queen-King-Ace of spades” separately; they are the same poker hand. 


To approach such a question, we begin with the permutations question: how
many possible poker hands are there, if order does matter?
52 × 51 × 50 × 49 × 48, or $\frac{52!}{47!}$. But we know that we are counting every
possible hand many different times in this calculation. How many times? 


The key insight is that this second question—“How many different times 
are we counting, for instance, Ace-King-Queen-Jack-Ten of spades?”—is 
itself a permutations question! It is the same as the question “How many 
different ways can these five cards be rearranged in a hand?” There are five 
possibilities for the first card; for each of these, four for the second; and so 
on. The answer is 5! which is 120. So, since we have counted every 
possible hand 120 times, we divide our earlier result by 120 to find that 
there are $\frac{52!}{(47!)(5!)}$, or about 2.6 million possibilities.

This question—“how many different 5-card hands can be made from 52 
cards?”—turns out to have a surprisingly large number of applications. 
Consider the following questions: 


• A school offers 50 classes. Each student must choose 6 of them to fill
out a schedule. How many possible schedules can be made?

• A basketball team has 12 players, but only 5 will start. How many
possible starting teams can they field?

• Your computer contains 300 videos, but you can only fit 10 of them on
your iPod. How many possible ways can you load your iPod?


Each of these is a combinations question, and can be answered exactly like 
our card scenario. Because this type of question comes up in so many 
different contexts, it is given a special name and symbol. The last question
would be referred to as “300 choose 10” and written $\binom{300}{10}$. It is
calculated, of course, as $\frac{300!}{(290!)(10!)}$,
for reasons explained above.
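The "divide out the repeated orderings" argument can be sketched in a few lines. The helper `choose` below is a hypothetical name written for illustration; it mirrors the factorial formula in the text, and `math.comb` (Python 3.8 or later) is the standard-library equivalent.

```python
import math

def choose(n: int, k: int) -> int:
    """Count k-item selections from n items when order does not matter."""
    return math.factorial(n) // (math.factorial(n - k) * math.factorial(k))

# Poker hands: start from the ordered count, then divide by the 5! = 120
# different orders in which the same five cards could have been dealt.
ordered_hands = math.perm(52, 5)                  # 52 * 51 * 50 * 49 * 48
poker_hands = ordered_hands // math.factorial(5)

print(poker_hands)     # 2598960
print(choose(50, 6))   # possible class schedules: 15890700
print(choose(12, 5))   # possible starting teams: 792
print(choose(300, 10) == math.comb(300, 10))  # True
```

The 300-choose-10 value is far too large to multiply out by hand, which is why the factorial form (or a built-in such as `math.comb`) is the practical route.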


Tree Diagrams (optional) 

This module introduces tree diagrams as a method for making some probability 
problems easier to solve. This module is included in the Elementary Statistics 
textbook/collection as an optional lesson. 


A tree diagram is a special type of graph used to determine the outcomes of an 
experiment. It consists of "branches" that are labeled with either frequencies or 
probabilities. Tree diagrams can make some probability problems easier to visualize 
and solve. The following example illustrates how to use a tree diagram. 


Example: 

In an urn, there are 11 balls. Three balls are red (R) and 8 balls are blue (B). Draw
two balls, one at a time, with replacement. "With replacement" means that you put 
the first ball back in the urn before you select the second ball. The tree diagram using 
frequencies that show all the possible outcomes follows. 


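[Tree diagram: the first set of branches shows the first draw and the second set shows the second draw; the frequencies at the ends of the branches are 64 BB, 24 BR, 24 RB, and 9 RR.]


Total = 64 + 24 + 24 + 9 = 121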


The first set of branches represents the first draw. The second set of branches 
represents the second draw. Each of the outcomes is distinct. In fact, we can list each 
red ball as R1, R2, and R3 and each blue ball as B1, B2, B3, B4, B5, B6, B7, and 
B8. Then the 9 RR outcomes can be written as: 

R1R1 R1R2 R1R3 R2R1 R2R2 R2R3 R3R1 R3R2 R3R3 

The other outcomes are similar. 

There are a total of 11 balls in the urn. Draw two balls, one at a time, and with 
replacement. There are 11 · 11 = 121 outcomes, the size of the sample space.
Exercise: 


Problem: List the 24 BR outcomes: B1R1, B1R2, B1R3, ... 
Solution: 
B1R1 B1R2 B1R3 B2R1 B2R2 B2R3 B3R1 B3R2 B3R3 B4R1 B4R2
B4R3 B5R1 B5R2 B5R3 B6R1 B6R2 B6R3 B7R1 B7R2 B7R3 B8R1
B8R2 B8R3


Exercise: 


Problem: Using the tree diagram, calculate P(RR). 
Solution: 


P(RR) = $\frac{3}{11} \cdot \frac{3}{11} = \frac{9}{121}$


Exercise: 


Problem: Using the tree diagram, calculate P(RB OR BR). 


Solution: 


P(RB OR BR) = $\frac{3}{11} \cdot \frac{8}{11} + \frac{8}{11} \cdot \frac{3}{11} = \frac{48}{121}$
Exercise: 


Problem: 


Using the tree diagram, calculate P(R on 1st draw AND B on 2nd draw). 


Solution: 


P(R on 1st draw AND B on 2nd draw) = P(RB) = $\frac{3}{11} \cdot \frac{8}{11} = \frac{24}{121}$


Exercise: 


Problem: 
Using the tree diagram, calculate P(R on 2nd draw given B on 1st draw). 
Solution: 


P(R on 2nd draw given B on 1st draw) = P(R on 2nd | B on 1st) = $\frac{24}{88} = \frac{3}{11}$


This problem is a conditional. The sample space has been reduced to those
outcomes that already have a blue on the first draw. There are 24 + 64 = 88
possible outcomes (24 BR and 64 BB). Twenty-four of the 88 possible
outcomes are BR. $\frac{24}{88} = \frac{3}{11}$


Exercise: 


Problem: Using the tree diagram, calculate P(BB). 


Solution: 

P(BB) = $\frac{64}{121}$
Exercise: 

Problem: 


Using the tree diagram, calculate 
P(B on the 2nd draw given R on the first draw). 


Solution: 
P(B on 2nd draw | R on 1st draw) = $\frac{8}{11}$


There are 9 + 24 outcomes that have R on the first draw (9 RR and 24 RB).
The sample space is then 9 + 24 = 33. Twenty-four of the 33 outcomes have
B on the second draw. The probability is then $\frac{24}{33} = \frac{8}{11}$.
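The frequencies on the with-replacement tree can also be obtained by listing the whole sample space and counting. The sketch below is one way to do that, assuming the ball labels R1 to R3 and B1 to B8 used earlier; the helper names `colors` and `prob` are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# Label the balls as in the text: R1..R3 are red, B1..B8 are blue.
balls = [f"R{i}" for i in range(1, 4)] + [f"B{i}" for i in range(1, 9)]

# With replacement, the same ball can appear on both draws: 11 * 11 ordered pairs.
sample_space = list(product(balls, repeat=2))

def colors(outcome):
    """Reduce an outcome such as ('B4', 'R2') to its color pattern, here 'BR'."""
    return "".join(ball[0] for ball in outcome)

def prob(event, given=lambda outcome: True):
    """P(event | given), found by counting equally likely outcomes."""
    reduced = [o for o in sample_space if given(o)]
    favorable = [o for o in reduced if event(o)]
    return Fraction(len(favorable), len(reduced))

p_rr = prob(lambda o: colors(o) == "RR")
p_rb_or_br = prob(lambda o: colors(o) in ("RB", "BR"))
p_bb = prob(lambda o: colors(o) == "BB")
p_r2_given_b1 = prob(lambda o: o[1][0] == "R", given=lambda o: o[0][0] == "B")
p_b2_given_r1 = prob(lambda o: o[1][0] == "B", given=lambda o: o[0][0] == "R")

print(len(sample_space))   # 121
print(p_rr, p_rb_or_br, p_bb)
print(p_r2_given_b1, p_b2_given_r1)
```

The `given` argument plays the role of the reduced sample space: a conditional probability is computed by first discarding every outcome that does not satisfy the condition.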


Example: 
An urn has 3 red marbles and 8 blue marbles in it. Draw two marbles, one at a time, 
this time without replacement from the urn. "Without replacement" means that you 
do not put the first ball back before you select the second ball. Below is a tree 
diagram. The branches are labeled with probabilities instead of frequencies. The 
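numbers at the ends of the branches are calculated by multiplying the numbers on the
two corresponding branches, for example $\frac{3}{11} \cdot \frac{2}{10} = \frac{6}{110}$.


[Tree diagram: first draw, then second draw, with branches labeled by probabilities; the probabilities at the ends of the branches are BB $\frac{56}{110}$, BR $\frac{24}{110}$, RB $\frac{24}{110}$, and RR $\frac{6}{110}$.]


Total = $\frac{56 + 24 + 24 + 6}{110} = \frac{110}{110} = 1$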


Note: If you draw a red on the first draw from the 3 red possibilities, there are 2 red
left to draw on the second draw. You do not put back or replace the first ball after 
you have drawn it. You draw without replacement, so that on the second draw there 
are 10 marbles left in the urn. 


Calculate the following probabilities using the tree diagram. 
Exercise: 


Problem: P(RR) = 


Solution: 
P(RR) = $\frac{3}{11} \cdot \frac{2}{10} = \frac{6}{110}$
Exercise: 
Problem: Fill in the blanks: 


P(RB or BR) = $\frac{3}{11} \cdot \frac{8}{10}$ + (___)(___) = $\frac{48}{110}$


Solution: 


P(RB or BR) = $\frac{3}{11} \cdot \frac{8}{10} + \left(\frac{8}{11}\right)\left(\frac{3}{10}\right) = \frac{48}{110}$


Exercise: 


Problem: P(R on 2nd | B on 1st) =

Solution:

P(R on 2nd | B on 1st) = $\frac{3}{10}$
Exercise: 

Problem: Fill in the blanks: 


P(R on 1st and B on 2nd) = P(RB) = (___)(___) = $\frac{24}{110}$


Solution:
P(R on 1st and B on 2nd) = P(RB) = $\left(\frac{3}{11}\right)\left(\frac{8}{10}\right) = \frac{24}{110}$
Exercise: 
Problem: P(BB) = 
Solution: 


P(BB) = $\frac{8}{11} \cdot \frac{7}{10} = \frac{56}{110}$


Exercise: 


Problem: P(B on 2nd | R on Ist) = 
Solution: 


There are 6 + 24 outcomes that have R on the first draw (6 RR and 24 RB).
The 6 and the 24 are frequencies. They are also the numerators of the fractions
$\frac{6}{110}$ and $\frac{24}{110}$. The sample space is no longer 110 but 6 + 24 = 30.
Twenty-four of the 30 outcomes have B on the second draw. The probability is then $\frac{24}{30} = \frac{8}{10}$.


Did you get this answer? 
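The without-replacement probabilities can be verified the same way, by enumerating ordered pairs of distinct marbles. This is a sketch under the assumption that the marbles are labeled R1 to R3 and B1 to B8; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

marbles = [f"R{i}" for i in range(1, 4)] + [f"B{i}" for i in range(1, 9)]

# Without replacement, the two draws must be different marbles: 11 * 10 ordered pairs.
sample_space = list(permutations(marbles, 2))

def first(outcome):
    """Color of the first marble drawn, 'R' or 'B'."""
    return outcome[0][0]

def second(outcome):
    """Color of the second marble drawn, 'R' or 'B'."""
    return outcome[1][0]

def prob(event, given=lambda outcome: True):
    """P(event | given), found by counting equally likely outcomes."""
    reduced = [o for o in sample_space if given(o)]
    return Fraction(sum(1 for o in reduced if event(o)), len(reduced))

p_rr = prob(lambda o: first(o) == "R" and second(o) == "R")
p_bb = prob(lambda o: first(o) == "B" and second(o) == "B")
p_rb_or_br = prob(lambda o: first(o) != second(o))
p_r2_given_b1 = prob(lambda o: second(o) == "R", given=lambda o: first(o) == "B")
p_b2_given_r1 = prob(lambda o: second(o) == "B", given=lambda o: first(o) == "R")

print(len(sample_space))              # 110
print(p_rr, p_bb, p_rb_or_br)         # printed in lowest terms
print(p_r2_given_b1, p_b2_given_r1)
```

`Fraction` prints values in lowest terms, so 6/110 appears as 3/55 and 24/30 as 4/5; they are the same numbers as in the tree.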


If we are using probabilities, we can label the tree in the following general way. 


[Tree diagram labeled with probabilities; a branch end is labeled P(R and R) = P(RR).]


• P(R|R) here means P(R on 2nd | R on 1st)
• P(B|R) here means P(B on 2nd | R on 1st)
• P(R|B) here means P(R on 2nd | B on 1st)
• P(B|B) here means P(B on 2nd | B on 1st)


Glossary 


Sample Space 
The set of all possible outcomes of an experiment. 


Tree Diagram 
A useful visual representation of a sample space and events in the form of a
“tree” with branches marked by possible outcomes together with their
associated probabilities (frequencies, relative frequencies).


Probability Topics 
This module introduces the concept of Probability, the chance of an event 
occurring. 


Student Learning Outcomes 
By the end of this chapter, the student should be able to: 


• Understand and use the terminology of probability.

• Determine whether two events are mutually exclusive and whether two
events are independent.

• Calculate probabilities using the Addition Rules and Multiplication
Rules.

• Construct and interpret Contingency Tables.

• Construct and interpret Venn Diagrams (optional).

• Construct and interpret Tree Diagrams (optional).


Introduction 


It is often necessary to "guess" about the outcome of an event in order to 
make a decision. Politicians study polls to guess their likelihood of winning 
an election. Teachers choose a particular course of study based on what they 
think students can comprehend. Doctors choose the treatments needed for 
various diseases based on their assessment of likely results. You may have 
visited a casino where people play games chosen because of the belief that 
the likelihood of winning is good. You may have chosen your course of 
study based on the probable availability of jobs. 


You have, more than likely, used probability. In fact, you probably have an 
intuitive sense of probability. Probability deals with the chance of an event 
occurring. Whenever you weigh the odds of whether or not to do your 
homework or to study for an exam, you are using probability. In this 
chapter, you will learn to solve probability problems using a systematic 
approach. 


Optional Collaborative Classroom Exercise 


Your instructor will survey your class. Count the number of students in the 
class today. 


• Raise your hand if you have any change in your pocket or purse.
Record the number of raised hands.

• Raise your hand if you rode a bus within the past month. Record the
number of raised hands.

• Raise your hand if you answered "yes" to BOTH of the first two
questions. Record the number of raised hands.


Use the class data as estimates of the following probabilities. P(change) 
means the probability that a randomly chosen person in your class has 
change in his/her pocket or purse. P(bus) means the probability that a 
randomly chosen person in your class rode a bus within the last month and 
so on. Discuss your answers. 


• Find P(change).

• Find P(bus).

• Find P(change and bus). Find the probability that a randomly chosen
student in your class has change in his/her pocket or purse and rode a
bus within the last month.

• Find P(change | bus). Find the probability that a randomly chosen
student has change given that he/she rode a bus within the last month.
Count all the students that rode a bus. From the group of students who
rode a bus, count those who have change. The probability is equal to
those who have change and rode a bus divided by those who rode a
bus.
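The four estimates in this exercise are simple ratios of counts. The sketch below shows the arithmetic with made-up numbers: the class size and hand counts are hypothetical placeholders, not data from any class, and should be replaced with your own tallies.

```python
from fractions import Fraction

# Hypothetical counts for illustration only; replace with your class's numbers.
class_size = 30
has_change = 18
rode_bus = 12
change_and_bus = 8

p_change = Fraction(has_change, class_size)
p_bus = Fraction(rode_bus, class_size)
p_change_and_bus = Fraction(change_and_bus, class_size)

# Conditional: restrict attention to the bus riders only.
p_change_given_bus = Fraction(change_and_bus, rode_bus)

# The same value comes from dividing P(change and bus) by P(bus).
same_by_formula = p_change_and_bus / p_bus

print(p_change, p_bus, p_change_and_bus, p_change_given_bus)
print(p_change_given_bus == same_by_formula)
```

Note that the denominator changes for the conditional estimate: it is the number of bus riders, not the class size.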


Probability 
This module introduces the concept of probability as a mathematical 
measure of randomness, including a number of real-world applications. 


Probability is a mathematical tool used to study randomness. It deals with 
the chance (the likelihood) of an event occurring. For example, if you toss a 
fair coin 4 times, the outcomes may not be 2 heads and 2 tails. However, if 
you toss the same coin 4,000 times, the outcomes will be close to half heads 
and half tails. The expected theoretical probability of heads in any one toss 
is $\frac{1}{2}$ or 0.5. Even though the outcomes of a few repetitions are uncertain,
there is a regular pattern of outcomes when there are many repetitions.
After reading about the English statistician Karl Pearson who tossed a coin
24,000 times with a result of 12,012 heads, one of the authors tossed a coin
2,000 times. The results were 996 heads. The fraction $\frac{996}{2000}$ is equal to 0.498,
which is very close to 0.5, the expected probability.


The theory of probability began with the study of games of chance such as 
poker. Predictions take the form of probabilities. To predict the likelihood 
of an earthquake, of rain, or whether you will get an A in this course, we 
use probabilities. Doctors use probability to determine the chance of a 
vaccination causing the disease the vaccination is supposed to prevent. A 
stockbroker uses probability to determine the rate of return on a client's 
investments. You might use probability to decide to buy a lottery ticket or 
not. In your study of statistics, you will use the power of mathematics 
through probability calculations to analyze and interpret your data. 


Glossary 


Probability 
A number between 0 and 1, inclusive, that gives the likelihood that a
specific event will occur. The foundation of statistics is given by the
following 3 axioms (by A. N. Kolmogorov, 1930’s): Let S denote the
sample space and let A and B be two events in S. Then:


• 0 ≤ P(A) ≤ 1.
• If A and B are any two mutually exclusive events, then
P(A or B) = P(A) + P(B).
• P(S) = 1.


Terminology 

Probability: Terminology is part of the collection col10555 written by
Barbara Illowsky and Susan Dean. It defines key terms related to probability
and has contributions from Roberta Bloom.


Probability is a measure that is associated with how certain we are of 
outcomes of a particular experiment or activity. An experiment is a 
planned operation carried out under controlled conditions. If the result is 
not predetermined, then the experiment is said to be a chance experiment. 
Flipping one fair coin twice is an example of an experiment. 


The result of an experiment is called an outcome. A sample space is a set 
of all possible outcomes. Three ways to represent a sample space are to list 
the possible outcomes, to create a tree diagram, or to create a Venn diagram. 
The uppercase letter S is used to denote the sample space. For example, if
you flip one fair coin, S = {H, T} where H = heads and T = tails are the 
outcomes. 


An event is any combination of outcomes. Upper case letters like A and B 
represent events. For example, if the experiment is to flip one fair coin, 
event A might be getting at most one head. The probability of an event A is 
written P(A). 


The probability of any outcome is the long-term relative frequency of 
that outcome. Probabilities are between 0 and 1, inclusive (includes 0 and 
1 and all numbers between these values). P(A) = 0 means the event A can 
never happen. P(A) = 1 means the event A always happens. P(A) = 0.5 
means the event A is equally likely to occur or not to occur. For example, if 
you flip one fair coin repeatedly (from 20 to 2,000 to 20,000 times) the 
relative frequency of heads approaches 0.5 (the probability of heads).


Equally likely means that each outcome of an experiment occurs with 
equal probability. For example, if you toss a fair, six-sided die, each face 
(1, 2, 3, 4, 5, or 6) is as likely to occur as any other face. If you toss a fair 
coin, a Head(H) and a Tail(T) are equally likely to occur. If you randomly 
guess the answer to a true/false question on an exam, you are equally likely 
to select a correct answer or an incorrect answer. 


To calculate the probability of an event A when all outcomes in the 
sample space are equally likely, count the number of outcomes for event 
A and divide by the total number of outcomes in the sample space. For 
example, if you toss a fair dime and a fair nickel, the sample space is 
{HH, TH, HT, TT} where T = tails and H = heads. The sample space 
has four outcomes. A = getting one head. There are two outcomes
{HT, TH}. P(A) = $\frac{2}{4}$.

Suppose you roll one fair six-sided die, with the numbers {1, 2, 3, 4, 5, 6} on
its faces. Let event E = rolling a number that is at least 5. There are two
outcomes {5, 6}. P(E) = $\frac{2}{6}$. If you were to roll the die only a few times,
you would not be surprised if your observed results did not match the
probability. If you were to roll the die a very large number of times, you 
would expect that, overall, 2/6 of the rolls would result in an outcome of "at 
least 5". You would not expect exactly 2/6. The long-term relative 
frequency of obtaining this result would approach the theoretical probability 
of 2/6 as the number of repetitions grows larger and larger. 


This important characteristic of probability experiments is known as the
Law of Large Numbers: as the number of repetitions of an experiment is 
increased, the relative frequency obtained in the experiment tends to 
become closer and closer to the theoretical probability. Even though the 
outcomes don't happen according to any set pattern or order, overall, the 
long-term observed relative frequency will approach the theoretical 
probability. (The word empirical is often used instead of the word 
observed.) The Law of Large Numbers will be discussed again in Chapter 7. 
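A short simulation can illustrate this behavior for the "at least 5" event. This is only a sketch: it uses Python's pseudo-random generator as a stand-in for a fair die, the function name is hypothetical, and the seed is an arbitrary choice made so that reruns give the same numbers.

```python
import random

def relative_frequency_at_least_5(num_rolls: int, rng: random.Random) -> float:
    """Simulate num_rolls rolls of a fair six-sided die; return the share that are 5 or 6."""
    hits = sum(1 for _ in range(num_rolls) if rng.randint(1, 6) >= 5)
    return hits / num_rolls

rng = random.Random(2024)   # arbitrary fixed seed, for reproducibility only
theoretical = 2 / 6

for num_rolls in (20, 2_000, 200_000):
    observed = relative_frequency_at_least_5(num_rolls, rng)
    print(f"{num_rolls:>7} rolls: observed {observed:.4f}, theoretical {theoretical:.4f}")
```

With only 20 rolls the observed value can land noticeably away from 2/6; with 200,000 rolls it is typically within a fraction of a percentage point, though still not exactly 2/6.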


It is important to realize that in many situations, the outcomes are not 
equally likely. A coin or die may be unfair, or biased. Two math
professors in Europe had their statistics students test the Belgian 1 Euro 
coin and discovered that in 250 trials, a head was obtained 56% of the time 
and a tail was obtained 44% of the time. The data seem to show that the 
coin is not a fair coin; more repetitions would be helpful to draw a more 
accurate conclusion about such bias. Some dice may be biased. Look at the 
dice in a game you have at home; the spots on each face are usually small 
holes carved out and then painted to make the spots visible. Your dice may 
or may not be biased; it is possible that the outcomes may be affected by the
slight weight differences due to the different numbers of holes in the faces.
Gambling casinos have a lot of money depending on outcomes from rolling 
dice, so casino dice are made differently to eliminate bias. Casino dice have 
flat faces; the holes are completely filled with paint having the same density 
as the material that the dice are made out of so that each face is equally 
likely to occur. Later in this chapter we will learn techniques to use to work 
with probabilities for events that are not equally likely. 


"OR" Event: 

An outcome is in the event A OR B if the outcome is in A or is in B or is 
in both A and B. For example, let A = {1, 2, 3, 4, 5} and 

B = {4, 5, 6, 7, 8}. A OR B = {1, 2, 3, 4, 5, 6, 7, 8}. Notice that 4
and 5 are NOT listed twice. 


"AND" Event: 

An outcome is in the event A AND B if the outcome is in both A and B at 
the same time. For example, let A and B be {1, 2, 3, 4, 5} and 

{4, 5, 6, 7, 8}, respectively. Then A AND B = {4, 5}. 


The complement of event A is denoted A’ (read "A prime"). A’ consists of 
all outcomes that are NOT in A. Notice that P(A) + P(A’) = 1. For 
example, let S = {1, 2, 3, 4, 5, 6} and let A = {1, 2, 3, 4}. Then, 

A’ = {5, 6}. P(A) = $\frac{4}{6}$, P(A’) = $\frac{2}{6}$, and P(A) + P(A’) = $\frac{4}{6} + \frac{2}{6} = 1$.

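The OR, AND, and complement events map directly onto set union, intersection, and difference. The sketch below replays the examples above with Python sets; the function name `prob` is hypothetical and it assumes every outcome in the sample space is equally likely.

```python
from fractions import Fraction

def prob(event: set, sample_space: set) -> Fraction:
    """Probability of an event when all outcomes in sample_space are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

# OR and AND examples from the text.
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}
a_or_b = A | B     # union: in A, in B, or in both (4 and 5 appear once)
a_and_b = A & B    # intersection: in both A and B

# Complement example from the text.
S = {1, 2, 3, 4, 5, 6}
A2 = {1, 2, 3, 4}
A2_complement = S - A2   # all outcomes of S that are NOT in A2

print(sorted(a_or_b))     # [1, 2, 3, 4, 5, 6, 7, 8]
print(sorted(a_and_b))    # [4, 5]
print(sorted(A2_complement))                      # [5, 6]
print(prob(A2, S) + prob(A2_complement, S))       # 1
```

Because a set never stores duplicates, the union automatically lists 4 and 5 only once, matching the remark in the OR example.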
The conditional probability of A given B is written P(A|B). P(A|B) is 
the probability that event A will occur given that the event B has already 
occurred. A conditional reduces the sample space. We calculate the 
probability of A from the reduced sample space B. The formula to calculate 
P(A|B) is 


$P(A|B) = \frac{P(A \text{ AND } B)}{P(B)}$


where P(B) is greater than 0. 
For example, suppose we toss one fair, six-sided die. The sample space
S = {1, 2, 3, 4, 5, 6}. Let A = face is 2 or 3 and B = face is even (2, 4, 6).
To calculate P(A|B), we count the number of outcomes 2 or 3 in the
sample space B = {2, 4, 6}. Then we divide that by the number of
outcomes in B (and not S). Only one of the three outcomes in B (the 2) is in A,
so P(A|B) = $\frac{1}{3}$.


We get the same result by using the formula. Remember that S has 6
outcomes.


$P(A|B) = \frac{P(A \text{ and } B)}{P(B)} = \frac{(\text{the number of outcomes that are 2 or 3 and even in } S)/6}{(\text{the number of outcomes that are even in } S)/6} = \frac{1/6}{3/6} = \frac{1}{3}$
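Both routes to P(A|B) in the die example, counting inside the reduced sample space and applying the formula, can be compared directly. This is a minimal sketch assuming equally likely faces; the variable names are illustrative.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # one fair six-sided die
A = {2, 3}               # face is 2 or 3
B = {2, 4, 6}            # face is even

# Method 1: count inside the reduced sample space B.
p_by_counting = Fraction(len(A & B), len(B))

# Method 2: the formula P(A|B) = P(A AND B) / P(B), with both taken over S.
p_a_and_b = Fraction(len(A & B), len(S))
p_b = Fraction(len(B), len(S))
p_by_formula = p_a_and_b / p_b

print(p_by_counting)   # 1/3
print(p_by_formula)    # 1/3
```

The two methods agree because the common denominator, the size of S, cancels in the formula.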


Understanding Terminology and Symbols 

It is important to read each problem carefully to think about and understand 
what the events are. Understanding the wording is the first very important 
step in solving probability problems. Reread the problem several times if 
necessary. Clearly identify the event of interest. Determine whether there is 
a condition stated in the wording that would indicate that the probability is 
conditional; carefully identify the condition, if any. 

Exercise: 


Problem: 


In a particular college class, there are male and female students. Some 
students have long hair and some students have short hair. Write the 
symbols for the probabilities of the events for parts (a) through (j) 
below. (Note that you can't find numerical answers here. You were not 
given enough information to find any probability values yet; 
concentrate on understanding the symbols.) 


• Let F be the event that a student is female.
• Let M be the event that a student is male.
• Let S be the event that a student has short hair.
• Let L be the event that a student has long hair.


a. The probability that a student does not have long hair.
b. The probability that a student is male or has short hair.
c. The probability that a student is a female and has long hair.
d. The probability that a student is male, given that the student has
long hair.
e. The probability that a student has long hair, given that the
student is male.
f. Of all the female students, the probability that a student has
short hair.
g. Of all students with long hair, the probability that a student is
female.
h. The probability that a student is female or has long hair.
i. The probability that a randomly selected student is a male
student with short hair.
j. The probability that a student is female.


Solution: 


a. P(L') = P(S)
b. P(M or S)
c. P(F and L)
d. P(M|L)
e. P(L|M)
f. P(S|F)
g. P(F|L)
h. P(F or L)
i. P(M and S)
j. P(F)


**With contributions from Roberta Bloom 


Glossary 


Conditional Probability 
The likelihood that an event will occur given that another event has 
already occurred. 


Equally Likely 
Each outcome of an experiment has the same probability. 


Experiment 
A planned activity carried out under controlled conditions. 


Event 
A subset in the set of all outcomes of an experiment. The set of all 
outcomes of an experiment is called a sample space and denoted 
usually by S. An event is any arbitrary subset in S. It can contain one 
outcome, two outcomes, no outcomes (empty subset), the entire 
sample space, etc. Standard notations for events are capital letters such 
as A, B, C, etc.


Outcome (observation) 
A particular result of an experiment. 


Probability 
A number between 0 and 1, inclusive, that gives the likelihood that a
specific event will occur. The foundation of statistics is given by the
following 3 axioms (by A. N. Kolmogorov, 1930’s): Let S denote the
sample space and let A and B be two events in S. Then:


• 0 ≤ P(A) ≤ 1.

• If A and B are any two mutually exclusive events, then
P(A or B) = P(A) + P(B).

• P(S) = 1.


Sample Space 
The set of all possible outcomes of an experiment. 


Two Basic Rules of Probability 
This module introduces the multiplication and addition rules used when calculating 
probabilities. 


The Multiplication Rule 


If A and B are two events defined on a sample space, then:
P(A AND B) = P(B) · P(A|B).


This rule may also be written as: $P(A|B) = \frac{P(A \text{ AND } B)}{P(B)}$
(The probability of A given B equals the probability of A and B divided by the
probability of B.)


If A and B are independent, then P(A|B) = P(A). Then 
P(A AND B) = P(A|B) P(B) becomes P(A AND B) = P(A) P(B). 


The Addition Rule 


If A and B are defined on a sample space, then: 
P(A OR B) = P(A) + P(B) - P(A AND B).


If A and B are mutually exclusive, then P(A AND B) = 0. Then
P(A OR B) = P(A) + P(B) - P(A AND B) becomes
P(A OR B) = P(A) + P(B).


Example: 
Klaus is trying to choose where to go on vacation. His two choices are: A = New
Zealand and B = Alaska.


• Klaus can only afford one vacation. The probability that he chooses A is
P(A) = 0.6 and the probability that he chooses B is P(B) = 0.35.

• P(A and B) = 0 because Klaus can only afford to take one vacation.

• Therefore, the probability that he chooses either New Zealand or Alaska is
P(A OR B) = P(A) + P(B) = 0.6 + 0.35 = 0.95. Note that the probability
that he does not choose to go anywhere on vacation must be 0.05.


Example: 


Carlos plays college soccer. He makes a goal 65% of the time he shoots. Carlos is going 
to attempt two goals in a row in the next game. 

A = the event Carlos is successful on his first attempt. P(A) = 0.65. B = the event 
Carlos is successful on his second attempt. P(B) = 0.65. Carlos tends to shoot in 
streaks. The probability that he makes the second goal GIVEN that he made the first 
goal is 0.90. 

Exercise: 


Problem: What is the probability that he makes both goals? 


Solution: 


The problem is asking you to find P(A AND B) = P(B AND A). Since 
P(B|A) = 0.90: 
Equation: 


P(B AND A) = P(B|A) P(A) = (0.90)(0.65) = 0.585


Carlos makes the first and second goals with probability 0.585. 
Exercise: 


Problem: 
What is the probability that Carlos makes either the first goal or the second goal? 
Solution: 


The problem is asking you to find P(A OR B). 
Equation: 


P(A OR B) = P(A) + P(B) - P(A AND B) = 0.65 + 0.65 - 0.585 = 0.715


Carlos makes either the first goal or the second goal with probability 0.715. 
Exercise: 


Problem: Are A and B independent? 


Solution: 


No, they are not, because P(B AND A) = 0.585.
Equation:


P(B) · P(A) = (0.65)(0.65) = 0.423
Equation:


0.423 ≠ 0.585 = P(B AND A)


So, P(B AND A) is not equal to P(B) · P(A).


Exercise: 


Problem: Are A and B mutually exclusive? 


Solution: 
No, they are not because P(A and B) = 0.585. 


To be mutually exclusive, P(A AND B) must equal 0. 
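The four answers in the Carlos example follow mechanically from the two rules. The sketch below reproduces them; the variable names are illustrative, and `math.isclose` is used for the comparisons because these are floating-point values.

```python
import math

p_a = 0.65          # P(A): Carlos makes the first goal
p_b = 0.65          # P(B): Carlos makes the second goal
p_b_given_a = 0.90  # P(B|A): makes the second, given he made the first

# Multiplication rule: P(A AND B) = P(B|A) * P(A).
p_a_and_b = p_b_given_a * p_a

# Addition rule: P(A OR B) = P(A) + P(B) - P(A AND B).
p_a_or_b = p_a + p_b - p_a_and_b

# Independent only if P(A AND B) equals P(A) * P(B).
independent = math.isclose(p_a_and_b, p_a * p_b)

# Mutually exclusive only if P(A AND B) equals 0.
mutually_exclusive = math.isclose(p_a_and_b, 0.0, abs_tol=1e-12)

print(round(p_a_and_b, 3))    # 0.585
print(round(p_a_or_b, 3))     # 0.715
print(independent)            # False
print(mutually_exclusive)     # False
```

The same four lines of arithmetic apply to any pair of events once P(A), P(B), and one conditional probability are known.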


Example: 

A community swim team has 150 members. Seventy-five of the members are advanced 
swimmers. Forty-seven of the members are intermediate swimmers. The remainder are 
novice swimmers. Forty of the advanced swimmers practice 4 times a week. Thirty of 
the intermediate swimmers practice 4 times a week. Ten of the novice swimmers 
practice 4 times a week. Suppose one member of the swim team is randomly chosen. 
Answer the questions (Verify the answers): 

Exercise: 


Problem: What is the probability that the member is a novice swimmer? 


Solution: 


$\frac{28}{150}$


Exercise: 


Problem: What is the probability that the member practices 4 times a week? 


Solution: 


$\frac{80}{150}$


Exercise: 
Problem: 


What is the probability that the member is an advanced swimmer and practices 4 
times a week? 


Solution: 


$\frac{40}{150}$


Exercise: 
Problem: 
What is the probability that a member is an advanced swimmer and an
intermediate swimmer? Are being an advanced swimmer and an intermediate
swimmer mutually exclusive? Why or why not?


Solution: 


P(advanced AND intermediate) = 0, so these are mutually exclusive events. 
A swimmer cannot be an advanced swimmer and an intermediate swimmer at the 
same time. 


Exercise: 


Problem: 


Are being a novice swimmer and practicing 4 times a week independent events? 
Why or why not? 


Solution: 


No, these are not independent events. 


Equation: 

P(novice AND practices 4 times per week) = 0.0667 
Equation: 

P(novice) · P(practices 4 times per week) = 0.0996
Equation:


0.0667 ≠ 0.0996
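The swim team answers can be verified from the counts given in the example. This is a sketch with illustrative variable names; exact fractions are used so that the independence check is an exact comparison rather than a rounded one.

```python
from fractions import Fraction

members = 150
advanced, intermediate = 75, 47
novice = members - advanced - intermediate   # "the remainder"

# Members who practice 4 times a week, by level (from the example).
practice4 = {"advanced": 40, "intermediate": 30, "novice": 10}

p_novice = Fraction(novice, members)
p_practice4 = Fraction(sum(practice4.values()), members)
p_advanced_and_practice4 = Fraction(practice4["advanced"], members)
p_novice_and_practice4 = Fraction(practice4["novice"], members)

# Independent only if P(novice AND practices 4) = P(novice) * P(practices 4).
independent = p_novice_and_practice4 == p_novice * p_practice4

print(novice)                                    # 28
print(round(float(p_novice_and_practice4), 4))   # 0.0667
print(round(float(p_novice * p_practice4), 4))   # 0.0996
print(independent)                               # False
```

Since 0.0667 and 0.0996 differ, being a novice and practicing 4 times a week are not independent, as the solution states.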


Example: 

Studies show that, if she lives to be 90, about 1 woman in 7 (approximately 14.3%) will 
develop breast cancer. Suppose that of those women who develop breast cancer, a test is 
negative 2% of the time. Also suppose that in the general population of women, the test 
for breast cancer is negative about 85% of the time. Let B = woman develops breast 
cancer and let N = tests negative. Suppose one woman is selected at random.

Exercise: 


Problem: 


What is the probability that the woman develops breast cancer? What is the 
probability that woman tests negative? 


Solution: 


P(B) = 0.143; P(N) = 0.85
Exercise: 


Problem: 


Given that the woman has breast cancer, what is the probability that she tests 
negative? 


Solution: 


P(N|B) = 0.02 
Exercise: 


Problem: 
What is the probability that the woman has breast cancer AND tests negative? 
Solution: 


P(B AND N) = P(B) · P(N|B) = (0.143)(0.02) = 0.0029
Exercise: 


Problem: 
What is the probability that the woman has breast cancer or tests negative? 


Solution: 


P(B OR N) = P(B) + P(N) - P(B AND N) = 0.143 + 0.85 - 0.0029 = 0.9901


Exercise: 


Problem: Are having breast cancer and testing negative independent events? 


Solution: 


No. P(N) = 0.85; P(N|B) = 0.02. So, P(N|B) does not equal P(N) 


Exercise: 


Problem: Are having breast cancer and testing negative mutually exclusive? 


Solution: 


No. P(B AND N) = 0.0029. For B and N to be mutually exclusive, 
P(B AND N) must be 0. 


Glossary 


Independent Events 
The occurrence of one event has no effect on the probability of the occurrence of 
any other event. Events A and B are independent if one of the following is true: (1)
P(A|B) = P(A); (2) P(B|A) = P(B); (3) P(A and B) = P(A)P(B).


Mutually Exclusive 
An observation cannot fall into more than one class (category). Being in more than 
one category prevents being in a mutually exclusive category. 


Sample Space 
The set of all possible outcomes of an experiment. 


Independent and Mutually Exclusive Events 

Probability: Independent and Mutually Exclusive Events is part of the 
collection col10555 written by Barbara Illowsky and Susan Dean and 
explains the concept of independent events, where the probability of event 
A does not have any effect on the probability of event B, and mutually 
exclusive events, where events A and B cannot occur at the same time. The 
module has contributions from Roberta Bloom. 


Independent and mutually exclusive do not mean the same thing. 


Independent Events 


Two events are independent if the following are true: 


• P(A|B) = P(A) 
• P(B|A) = P(B) 
• P(A AND B) = P(A) · P(B) 


Two events A and B are independent if the knowledge that one occurred 
does not affect the chance the other occurs. For example, the outcomes of 
two rolls of a fair die are independent events. The outcome of the first roll 
does not change the probability for the outcome of the second roll. To show 
two events are independent, you must show only one of the above 
conditions. If two events are NOT independent, then we say that they are 
dependent. 


Sampling may be done with replacement or without replacement. 


• With replacement: If each member of a population is replaced after it 
is picked, then that member has the possibility of being chosen more 
than once. When sampling is done with replacement, then events are 
considered to be independent, meaning the result of the first pick will 
not change the probabilities for the second pick. 

• Without replacement: When sampling is done without replacement, 
then each member of a population may be chosen only once. In this 
case, the probabilities for the second pick are affected by the result of 
the first pick. The events are considered to be dependent or not 
independent. 


If it is not known whether A and B are independent or dependent, assume 
they are dependent until you can show otherwise. 
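
The difference between the two kinds of sampling can be seen with exact fractions. The sketch below uses a hypothetical bag of 3 red and 2 blue marbles (the bag is an assumption for illustration and does not come from the text) and compares the chance that the second pick is red, given that the first pick was red, with the chance of red on a single pick from the full bag.

```python
from fractions import Fraction

# Hypothetical bag (not from the text): 3 red and 2 blue marbles
red, blue = 3, 2
total = red + blue

# P(red) on a single pick from the full bag
p_red = Fraction(red, total)

# P(second red | first red) with replacement: the bag is unchanged
with_replacement = Fraction(red, total)

# P(second red | first red) without replacement: one red marble is gone
without_replacement = Fraction(red - 1, total - 1)

print(with_replacement == p_red)     # True: the picks are independent
print(without_replacement == p_red)  # False: the picks are dependent
```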


Mutually Exclusive Events 


A and B are mutually exclusive events if they cannot occur at the same 
time. This means that A and B do not share any outcomes and 
P(A AND B) = 0. 


For example, suppose the sample space S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. 
Let A = {1, 2, 3, 4, 5}, B = {4, 5, 6, 7, 8}, and C = {7, 9}. 

A AND B = {4, 5}. P(A AND B) = 2/10 and is not equal to zero. 
Therefore, A and B are not mutually exclusive. A and C do not have any 
numbers in common so P(A AND C) = 0. Therefore, A and C are 
mutually exclusive. 


If it is not known whether A and B are mutually exclusive, assume they 
are not until you can show otherwise. 
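
Because events are sets of outcomes, the check for mutual exclusivity is a set intersection. The following sketch assumes the ten outcomes in S are equally likely; the helper name prob is illustrative.

```python
from fractions import Fraction

S = set(range(1, 11))      # sample space {1, ..., 10}
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}
C = {7, 9}

def prob(event):
    """Probability of an event, assuming equally likely outcomes in S."""
    return Fraction(len(event & S), len(S))

a_and_b = A & B            # outcomes shared by A and B
a_and_c = A & C            # outcomes shared by A and C

print(sorted(a_and_b), prob(a_and_b))   # shared outcomes exist, so not mutually exclusive
print(sorted(a_and_c), prob(a_and_c))   # no shared outcomes, so mutually exclusive
```

Fraction reduces 2/10 to 1/5, which is the same value.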


The following examples illustrate these definitions and terms. 


Example: 

Flip two fair coins. (This is an experiment.) 

The sample space is {HH, HT, TH, TT} where T = tails and H = heads. 
The outcomes are HH, HT, TH, and TT. The outcomes HT and TH are 
different. The HT means that the first coin showed heads and the second 
coin showed tails. The TH means that the first coin showed tails and the 
second coin showed heads. 


• Let A = the event of getting at most one tail. (At most one tail means 
0 or 1 tail.) Then A can be written as {HH, HT, TH}. The outcome 
HH shows 0 tails. HT and TH each show 1 tail. 

• Let B = the event of getting all tails. B can be written as {TT}. B is 
the complement of A. So, B = A’. Also, 
P(A) + P(B) = P(A) + P(A’) = 1. 

• The probabilities for A and for B are P(A) = 3/4 and P(B) = 1/4. 

• Let C = the event of getting all heads. C = {HH}. Since B = {TT}, 
P(B AND C) = 0. B and C are mutually exclusive. (B and C have 
no members in common because you cannot have all tails and all 
heads at the same time.) 

• Let D = event of getting more than one tail. D = {TT}. P(D) = 1/4. 

• Let E = event of getting a head on the first flip. (This implies you can 
get either a head or tail on the second flip.) E = {HT, HH}. 
P(E) = 2/4. 

• Find the probability of getting at least one (1 or 2) tail in two flips. 
Let F = event of getting at least one tail in two flips. 
F = {HT, TH, TT}. P(F) = 3/4. 
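
The events in this example can also be listed by enumerating the sample space. This is a minimal sketch assuming two fair coins, so the four outcomes are equally likely; the helper name prob is illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space for two flips, in the order HH, HT, TH, TT
sample_space = ["".join(flips) for flips in product("HT", repeat=2)]

def prob(event):
    """Probability of an event (a list of outcomes), all outcomes equally likely."""
    return Fraction(len(event), len(sample_space))

A = [o for o in sample_space if o.count("T") <= 1]   # at most one tail
B = [o for o in sample_space if o.count("T") == 2]   # all tails
C = [o for o in sample_space if o.count("H") == 2]   # all heads
E = [o for o in sample_space if o[0] == "H"]         # head on the first flip
F = [o for o in sample_space if o.count("T") >= 1]   # at least one tail

b_and_c = [o for o in B if o in C]                   # empty: B and C are mutually exclusive

print(prob(A), prob(B), prob(E), prob(F), prob(b_and_c))
```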

Example: 

Roll one fair 6-sided die. The sample space is {1, 2, 3, 4, 5, 6}. Let event 
A = a face is odd. Then A = {1, 3, 5}. Let event B = a face is even. Then 
B = {2, 4, 6}. 


• Find the complement of A, A’. The complement of A, A’, is B 
because A and B together make up the sample space. 
P(A) + P(B) = P(A) + P(A’) = 1. Also, P(A) = 3/6 and 
P(B) = 3/6. 

• Let event C = odd faces larger than 2. Then C = {3, 5}. Let event D 
= all even faces smaller than 5. Then D = {2, 4}. P(C AND D) = 0 
because you cannot have an odd and even face at the same time. 
Therefore, C and D are mutually exclusive events. 

• Let event E = all faces less than 5. E = {1, 2, 3, 4}. 
Exercise: 


Problem: 


Are C and E mutually exclusive events? (Answer yes or no.) 
Why or why not? 


Solution: 


No. C = {3, 5} and E = {1, 2, 3, 4}. P(C AND E) = 1/6. To be 
mutually exclusive, P(C AND E) must be 0. 


• Find P(C|A). This is a conditional. Recall that the event C is {3, 5} 
and event A is {1, 3, 5}. To find P(C|A), find the probability of C 
using the sample space A. You have reduced the sample space from 
the original sample space {1, 2, 3, 4, 5, 6} to {1, 3, 5}. So, 

P(C|A) = 2/3 
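
The idea of a reduced sample space translates directly into counting. In this sketch (function name illustrative, equally likely faces assumed), a conditional probability counts the outcomes of the event that lie inside the given event and divides by the size of the given event.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # one fair die
A = {1, 3, 5}            # odd faces
C = {3, 5}               # odd faces larger than 2
E = {1, 2, 3, 4}         # faces less than 5

def cond_prob(event, given):
    """P(event | given): count inside the reduced sample space 'given'."""
    return Fraction(len(event & given), len(given))

p_c_given_a = cond_prob(C, A)      # sample space reduced to {1, 3, 5}
p_c_and_e = cond_prob(C & E, S)    # ordinary probability: condition on all of S

print(p_c_given_a, p_c_and_e)
```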


Example: 

Let event G = taking a math class. Let event H = taking a science class. 
Then, G AND H = taking a math class and a science class. Suppose 
P(G) = 0.6, P(H) = 0.5, and P(G AND H) = 0.3. Are G and H 
independent? 

If G and H are independent, then you must show ONE of the following: 


• P(G|H) = P(G) 
• P(H|G) = P(H) 
• P(G AND H) = P(G) · P(H) 


Note: The choice you make depends on the information you have. You 
could choose any of the methods here because you have the necessary 
information. 


Exercise: 


Problem: Show that P(G|H) = P(G). 


Solution: 


P(G|H) = P(G AND H) / P(H) = 0.3 / 0.5 = 0.6 = P(G) 


Exercise: 
Problem: Show P(G AND H) = P(G) · P(H). 


Solution: 


P(G) · P(H) = 0.6 · 0.5 = 0.3 = P(G AND H) 


Since G and H are independent, then, knowing that a person is taking a 
science class does not change the chance that he/she is taking math. If the 
two events had not been independent (that is, they are dependent) then 
knowing that a person is taking a science class would change the chance 
he/she is taking math. For practice, show that P(H|G) = P(H) to show 
that G and H are independent events. 
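
All three independence conditions can be checked at once. The sketch below uses math.isclose because decimal values such as 0.3 and 0.6 are not stored exactly in floating point; the variable names are illustrative.

```python
from math import isclose

p_g = 0.6          # P(G): taking a math class
p_h = 0.5          # P(H): taking a science class
p_g_and_h = 0.3    # P(G AND H)

# Conditional probabilities from the multiplication rule
p_g_given_h = p_g_and_h / p_h
p_h_given_g = p_g_and_h / p_g

# Any one of these being true is enough to show independence
cond_1 = isclose(p_g_given_h, p_g)
cond_2 = isclose(p_h_given_g, p_h)
cond_3 = isclose(p_g_and_h, p_g * p_h)

print(cond_1, cond_2, cond_3)
```

For independent events the three conditions agree, which is why showing one of them is sufficient.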


Example: 

In a box there are 3 red cards and 5 blue cards. The red cards are marked 
with the numbers 1, 2, and 3, and the blue cards are marked with the 
numbers 1, 2, 3, 4, and 5. The cards are well-shuffled. You reach into the 
box (you cannot see into it) and draw one card. 

Let R = red card is drawn, B = blue card is drawn, E = even-numbered 
card is drawn. 

The sample space S = {R1, R2, R3, B1, B2, B3, B4, B5}. S has 8 
outcomes. 


• P(R) = 3/8. P(B) = 5/8. P(R AND B) = 0. (You cannot draw one 
card that is both red and blue.) 

• P(E) = 3/8. (There are 3 even-numbered cards, R2, B2, and B4.) 

• P(E|B) = 2/5. (There are 5 blue cards: B1, B2, B3, B4, and B5. Out 
of the blue cards, there are 2 even cards: B2 and B4.) 

• P(B|E) = 2/3. (There are 3 even-numbered cards: R2, B2, and B4. 
Out of the even-numbered cards, 2 are blue: B2 and B4.) 

• The events R and B are mutually exclusive because 
P(R AND B) = 0. 

• Let G = card with a number greater than 3. G = {B4, B5}. 
P(G) = 2/8. Let H = blue card numbered between 1 and 4, inclusive. 
H = {B1, B2, B3, B4}. P(G|H) = 1/4. (The only card in H that has 
a number greater than 3 is B4.) Since 2/8 = 1/4, P(G) = P(G|H), 
which means that G and H are independent. 
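
The card probabilities can be verified by listing the eight cards and counting. This is a sketch: the tuple encoding of a card and the helper prob are illustrative choices, and every card is assumed equally likely to be drawn.

```python
from fractions import Fraction

# Each card is (color, number): 3 red cards and 5 blue cards
cards = [("R", n) for n in (1, 2, 3)] + [("B", n) for n in (1, 2, 3, 4, 5)]

def prob(event, given=lambda card: True):
    """P(event | given), counting cards; 'given' defaults to the whole box."""
    pool = [card for card in cards if given(card)]
    return Fraction(sum(1 for card in pool if event(card)), len(pool))

def is_blue(card):
    return card[0] == "B"

def is_even(card):
    return card[1] % 2 == 0

def in_g(card):            # G: number greater than 3
    return card[1] > 3

def in_h(card):            # H: blue card numbered 1 through 4
    return card[0] == "B" and 1 <= card[1] <= 4

print(prob(is_even), prob(is_even, is_blue), prob(is_blue, is_even))
print(prob(in_g) == prob(in_g, in_h))   # equal, so G and H are independent
```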


Example: 

In a particular college class, 60% of the students are female. 50% of all 
students in the class have long hair. 45% of the students are female and 
have long hair. Of the female students, 75% have long hair. Let F be the 
event that the student is female. Let L be the event that the student has long 
hair. One student is picked randomly. Are the events of being female and 
having long hair independent? 


• The following probabilities are given in this example: 
• P(F) = 0.60; P(L) = 0.50 
• P(F AND L) = 0.45 
• P(L|F) = 0.75 


Note: The choice you make depends on the information you have. You 
could use the first or last condition on the list for this example. You do not 
know P(F|L) yet, so you cannot use the second condition. 


Solution 1 

Check whether P(F AND L) = P(F)P(L): We are given that P(F AND L) = 0.45, 
but P(F)P(L) = (0.60)(0.50) = 0.30. The events of being female and having 
long hair are not independent because P(F AND L) does not equal P(F)P(L). 

Solution 2 

Check whether P(L|F) equals P(L): We are given that P(L|F) = 0.75, but 
P(L) = 0.50; they are not equal. The events of being female and having 
long hair are not independent. 

Interpretation of Results 

The events of being female and having long hair are not independent; 
knowing that a student is female changes the probability that a student has 
long hair. 
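
Both solutions, and the remaining conditional P(F|L), can be computed from the given values. This is a sketch with illustrative variable names; P(F|L) is obtained from the multiplication rule as P(F AND L) / P(L).

```python
from math import isclose

p_f = 0.60           # P(F): student is female
p_l = 0.50           # P(L): student has long hair
p_f_and_l = 0.45     # P(F AND L)
p_l_given_f = 0.75   # P(L|F)

# Solution 1: compare P(F AND L) with P(F)P(L)
product_rule_holds = isclose(p_f_and_l, p_f * p_l)

# Solution 2: compare P(L|F) with P(L)
conditional_equals_marginal = isclose(p_l_given_f, p_l)

# The given values are consistent with each other: P(F AND L) / P(F) = P(L|F)
given_values_consistent = isclose(p_f_and_l / p_f, p_l_given_f)

# The conditional that was not given
p_f_given_l = p_f_and_l / p_l

print(product_rule_holds, conditional_equals_marginal, round(p_f_given_l, 2))
```

Since P(F|L) = 0.90 differs from P(F) = 0.60, the remaining condition leads to the same conclusion.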


**Example 5 contributed by Roberta Bloom 


Glossary 


Independent Events 
The occurrence of one event has no effect on the probability of the 
occurrence of any other event. Events A and B are independent if one 
of the following is true: (1) P(A|B) = P(A); (2) P(B|A) = P(B); 
(3) P(A AND B) = P(A)P(B). 


Mutually Exclusive 
An observation cannot fall into more than one class (category). Being 
in more than one category prevents being in a mutually exclusive 
category. 


Contingency Tables 
This module introduces the contingency table as a way of determining conditional 
probabilities. 


A contingency table provides a way of portraying data that can facilitate calculating 
probabilities. The table helps in determining conditional probabilities quite easily. The 
table displays sample values in relation to two different variables that may be dependent 
or contingent on one another. Later on, we will use contingency tables again, but in 
another manner. 


Example: 
Suppose a study of speeding violations and drivers who use car phones produced the 
following fictional data: 


                        Speeding violation    No speeding violation 
                        in the last year      in the last year         Total 
Car phone user          25                    280                      305 
Not a car phone user    45                    405                      450 
Total                   70                    685                      755 


The total number of people in the sample is 755. The row totals are 305 and 450. The 
column totals are 70 and 685. Notice that 305 + 450 = 755 and 70 + 685 = 755. 
Calculate the following probabilities using the table 

Exercise: 


Problem: P(person is a car phone user) = 


Solution: 


(number of car phone users) / (total number in study) = 305/755 


Exercise: 


Problem: P(person had no violation in the last year) = 


Solution: 


(number that had no violation) / (total number in study) = 685/755 


Exercise: 


Problem: 
P(person had no violation in the last year AND was a car phone user) = 


Solution: 


280/755 


Exercise: 


Problem: 


P(person is a car phone user OR person had no violation in the last year) = 


Solution: 

(305/755 + 685/755) - 280/755 = 710/755 
Exercise: 

Problem: 


P(person is a car phone user GIVEN person had a violation in the last year) = 
Solution: 


25/70 (The sample space is reduced to the number of persons who had a violation.) 
Exercise: 


Problem: 
P(person had no violation last year GIVEN person was not a car phone user) = 


Solution: 


405/450 (The sample space is reduced to the number of persons who were not car phone 
users.) 
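
The same six answers can be read off the table programmatically. The sketch below stores the four cell counts in a dictionary (an illustrative layout) and computes each probability as a count divided by the appropriate total; Fraction reduces the results automatically, so 305/755 prints as 61/151.

```python
from fractions import Fraction

# Cell counts from the table: (row, column) -> number of people
table = {
    ("user", "violation"): 25,
    ("user", "no violation"): 280,
    ("non-user", "violation"): 45,
    ("non-user", "no violation"): 405,
}
total = sum(table.values())   # 755 people in the sample

def count(row=None, col=None):
    """Sum the cells that match the given row and/or column label."""
    return sum(n for (r, c), n in table.items()
               if row in (None, r) and col in (None, c))

p_user = Fraction(count(row="user"), total)
p_no_violation = Fraction(count(col="no violation"), total)
p_user_and_no_violation = Fraction(count("user", "no violation"), total)
p_user_or_no_violation = p_user + p_no_violation - p_user_and_no_violation

# Conditional probabilities divide by a column or row total instead of 755
p_user_given_violation = Fraction(count("user", "violation"),
                                  count(col="violation"))
p_no_violation_given_non_user = Fraction(count("non-user", "no violation"),
                                         count(row="non-user"))

print(p_user, p_no_violation, p_user_or_no_violation)
print(p_user_given_violation, p_no_violation_given_non_user)
```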


Example: 
The following table shows a random sample of 100 hikers and the areas of hiking 
preferred: 


Sex       The Coastline    Near Lakes and Streams    On Mountain Peaks    Total 
Female    18               16                        ___                  45 
Male      ___              ___                       14                   55 
Total     ___              41                        ___                  ___ 


Hiking Area Preference 


Exercise: 


Problem: Complete the table. 


Solution: 
Sex       The Coastline    Near Lakes and Streams    On Mountain Peaks    Total 
Female    18               16                        11                   45 
Male      16               25                        14                   55 
Total     34               41                        25                   100 


Hiking Area Preference 


Exercise: 


Problem: 
Are the events "being female" and "preferring the coastline" independent events? 
Let F = being female and let C = preferring the coastline. 


a. P(F AND C) = 
b. P(F) · P(C) = 


Are these two numbers the same? If they are, then F and C are independent. If they 
are not, then F and C are not independent. 


Solution: 
a. P(F AND C) = 18/100 = 0.18 
b. P(F) · P(C) = (45/100)(34/100) = 0.45 · 0.34 = 0.153 


P(F AND C) ≠ P(F) · P(C), so the events F and C are not independent. 
Exercise: 

Problem: 

Find the probability that a person is male given that the person prefers hiking near 
lakes and streams. Let M = being male and let L = prefers hiking near lakes and 
streams. 


a. What word tells you this is a conditional? 
b. Fill in the blanks and calculate the probability: P(__|__) = 
c. Is the sample space for this problem all 100 hikers? If not, what is it? 


Solution: 
a. The word 'given' tells you that this is a conditional. 
b. P(M|L) = 25/41 
c. No, the sample space for this problem is the 41 hikers who prefer lakes and streams. 


Exercise: 


Problem: 


Find the probability that a person is female or prefers hiking on mountain peaks. 
Let F = being female and let P = prefers mountain peaks. 


a. P(F) = 
b. P(P) = 
c. P(F AND P) = 
d. Therefore, P(F OR P) = 


Solution: 
a. P(F) = 45/100 
b. P(P) = 25/100 
c. P(F AND P) = 11/100 
d. P(F OR P) = 45/100 + 25/100 - 11/100 = 59/100 
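
The hiker exercises can be reproduced step by step: fill in the blank cells from the row and column totals, then compute the probabilities. This is a sketch with illustrative names; the only inputs are the values printed in the incomplete table.

```python
from fractions import Fraction

total = 100
female_total, male_total = 45, 55
lakes_total = 41
female = {"coast": 18, "lakes": 16}   # known cells in the Female row
male = {"peaks": 14}                  # known cell in the Male row

# Fill the blanks using row and column totals
female["peaks"] = female_total - female["coast"] - female["lakes"]
male["lakes"] = lakes_total - female["lakes"]
male["coast"] = male_total - male["lakes"] - male["peaks"]
col_total = {area: female[area] + male[area] for area in female}

# Independence of F (female) and C (prefers the coastline)
p_f = Fraction(female_total, total)
p_c = Fraction(col_total["coast"], total)
p_f_and_c = Fraction(female["coast"], total)
independent = p_f_and_c == p_f * p_c

# P(M|L): the sample space is reduced to the hikers who prefer lakes and streams
p_m_given_l = Fraction(male["lakes"], col_total["lakes"])

# Addition rule for F OR P (prefers mountain peaks)
p_f_or_p = (p_f + Fraction(col_total["peaks"], total)
            - Fraction(female["peaks"], total))

print(female, male, col_total)
print(independent, p_m_given_l, p_f_or_p)
```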
Example: 


Muddy Mouse lives in a cage with 3 doors. If Muddy goes out the first door, the 
probability that he gets caught by Alissa the cat is 1/5 and the probability he is not caught 
is 4/5. If he goes out the second door, the probability he gets caught by Alissa is 1/4 and 
the probability he is not caught is 3/4. The probability that Alissa catches Muddy coming 
out of the third door is 1/2 and the probability she does not catch Muddy is 1/2. It is 
equally likely that Muddy will choose any of the three doors so the probability of 
choosing each door is 1/3. 


Caught or Not    Door One    Door Two    Door Three    Total 
Caught           1/15        1/12        1/6           ___ 
Not Caught       4/15        3/12        1/6           ___ 
Total            ___         ___         ___           ___ 


Door Choice 


• The first entry 1/15 = (1/5)(1/3) is P(Door One AND Caught). 
• The entry 4/15 = (4/5)(1/3) is P(Door One AND Not Caught). 


Verify the remaining entries. 
Exercise: 


Problem: 


Complete the probability contingency table. Calculate the entries for the totals. 
Verify that the lower-right corner entry is 1. 


Solution: 


Caught or Not    Door One    Door Two    Door Three    Total 
Caught           1/15        1/12        1/6           19/60 
Not Caught       4/15        3/12        1/6           41/60 
Total            5/15        4/12        2/6           1 


Door Choice 


Exercise: 


Problem: What is the probability that Alissa does not catch Muddy? 


Solution: 


41/60 


Exercise: 


Problem: 


What is the probability that Muddy chooses Door One OR Door Two given that 
Muddy is caught by Alissa? 


Solution: 


9/19 


Note: You could also do this problem by using a probability tree. See the Tree Diagrams 
(Optional) section of this chapter for examples. 
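
The Muddy Mouse table can be rebuilt with exact fractions. The sketch below assumes catch probabilities of 1/5, 1/4, and 1/2 for Doors One, Two, and Three and a 1/3 chance of choosing each door; these values agree with the table entries 1/15 and 1/12 and with the totals 19/60 and 41/60. The dictionary names are illustrative.

```python
from fractions import Fraction

p_door = Fraction(1, 3)    # each door is equally likely
p_caught_given_door = {1: Fraction(1, 5), 2: Fraction(1, 4), 3: Fraction(1, 2)}

# Joint probabilities: P(Door AND Caught) = P(Door) * P(Caught | Door)
caught = {d: p_door * p for d, p in p_caught_given_door.items()}
not_caught = {d: p_door * (1 - p) for d, p in p_caught_given_door.items()}

# Row totals
p_caught = sum(caught.values())
p_not_caught = sum(not_caught.values())

# P(Door One OR Door Two | Caught): restrict attention to the Caught row
p_one_or_two_given_caught = (caught[1] + caught[2]) / p_caught

print(p_caught, p_not_caught, p_caught + p_not_caught)
print(p_one_or_two_given_caught)
```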


Glossary 


Contingency Table 
The method of displaying a frequency distribution as a table with rows and columns 
to show how two variables may be dependent (contingent) upon each other. The 
table provides an easy way to calculate conditional probabilities. 


Summary of Formulas 

This module provides a review of the probability formulas, including the 
definitions of independent, complementary, and mutually exclusive events 
as well as the addition and multiplication rules. 

Formula 

Complement 


If A and A’ are complements then P(A) + P(A’) = 1 
Formula 
Addition Rule 


P(A OR B) = P(A) + P(B) - P(A AND B) 
Formula 
Mutually Exclusive 


If A and B are mutually exclusive then P(A AND B) = 0; so 
P(A OR B) = P(A) + P(B). 

Formula 

Multiplication Rule 


• P(A AND B) = P(B)P(A|B) 
• P(A AND B) = P(A)P(B|A) 


Formula 
Independence 


If A and B are independent then: 


• P(A|B) = P(A) 
• P(B|A) = P(B) 
• P(A AND B) = P(A)P(B) 


Practice 1: Contingency Tables 

This module provides the opportunity for students to apply what they've learned about probability to 
solve a series of problems given a set of data. Students will practice constructing and interpreting 
contingency tables. 


Student Learning Outcomes 


• The student will construct and interpret contingency tables. 


Given 

An article in the New England Journal of Medicine reported on a study of smokers in California 
and Hawaii. In one part of the report, the self-reported ethnicity and smoking levels per day were 
given. Of the people smoking at most 10 cigarettes per day, there were 9886 African Americans, 2745 
Native Hawaiians, 12,831 Latinos, 8378 Japanese Americans, and 7650 Whites. Of the people 
smoking 11-20 cigarettes per day, there were 6514 African Americans, 3062 Native Hawaiians, 4932 
Latinos, 10,680 Japanese Americans, and 9877 Whites. Of the people smoking 21-30 cigarettes per 
day, there were 1671 African Americans, 1419 Native Hawaiians, 1406 Latinos, 4715 Japanese 
Americans, and 6062 Whites. Of the people smoking at least 31 cigarettes per day, there were 759 
African Americans, 788 Native Hawaiians, 800 Latinos, 2305 Japanese Americans, and 3970 Whites. 
(Source: http://www.nejm.org/doi/full/10.1056/NEJMoa033250) 


Complete the Table 


Complete the table below using the data provided. 


Smoking Level    African American    Native Hawaiian    Latino    Japanese Americans    White    TOTALS 
1-10 
11-20 
21-30 
31+ 
TOTALS 


Smoking Levels by Ethnicity 


Analyze the Data 


Suppose that one person from the study is randomly selected. 


Exercise: 


Problem: Find the probability that person smoked 11-20 cigarettes per day. 


Solution: 


35,065/100,450 


Exercise: 


Problem: Find the probability that person was Latino. 


Solution: 


19,969/100,450 


Discussion Questions 


Exercise: 


Problem: 


In words, explain what it means to pick one person from the study and that person is “Japanese 
American AND smokes 21-30 cigarettes per day.” Also, find the probability. 
Solution: 


4,715/100,450 


Exercise: 


Problem: 


In words, explain what it means to pick one person from the study and that person is “Japanese 
American OR smokes 21-30 cigarettes per day.” Also, find the probability. 


Solution: 


36,636/100,450 


Exercise: 


Problem: 


In words, explain what it means to pick one person from the study and that person is “Japanese 
American GIVEN that person smokes 21-30 cigarettes per day.” Also, find the probability. 


Solution: 


4,715/15,273 


Exercise: 


Problem: Prove that smoking level/day and ethnicity are dependent events. 
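
The totals used in the solutions above can be reproduced from the counts in the Given section. This is a sketch with an illustrative data layout; it builds the row, column, and grand totals and then recomputes the five probabilities already answered.

```python
from fractions import Fraction

groups = ["African American", "Native Hawaiian", "Latino",
          "Japanese American", "White"]

# Counts from the Given section, one row per smoking level
counts = {
    "1-10":  [9886, 2745, 12831, 8378, 7650],
    "11-20": [6514, 3062, 4932, 10680, 9877],
    "21-30": [1671, 1419, 1406, 4715, 6062],
    "31+":   [759, 788, 800, 2305, 3970],
}

row_total = {level: sum(row) for level, row in counts.items()}
col_total = {g: sum(row[i] for row in counts.values())
             for i, g in enumerate(groups)}
grand_total = sum(row_total.values())

ja = groups.index("Japanese American")
ja_and_21_30 = counts["21-30"][ja]

p_11_20 = Fraction(row_total["11-20"], grand_total)
p_latino = Fraction(col_total["Latino"], grand_total)
p_ja_and_21_30 = Fraction(ja_and_21_30, grand_total)
p_ja_or_21_30 = Fraction(col_total["Japanese American"] + row_total["21-30"]
                         - ja_and_21_30, grand_total)
p_ja_given_21_30 = Fraction(ja_and_21_30, row_total["21-30"])

print(row_total, col_total, grand_total)
```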


Practice 2: Calculating Probabilities 

This module allows students to practice using what they've learned about Probability. Students will apply their 
understanding of basic probability terms, calculate probabilities based on the data provided, and determine whether 
events are independent or mutually exclusive. 


Student Learning Outcomes 
• Students will define basic probability terms. 
• Students will calculate probabilities. 
• Students will determine whether two events are mutually exclusive or whether two events are independent. 


Note: Use probability rules to solve the problems below. Show your work. 


Given 

48% of all California registered voters prefer life in prison without parole over the death penalty for a person 
convicted of first degree murder. Among Latino California registered voters, 55% prefer life in prison without 
parole over the death penalty for a person convicted of first degree murder. (Source: 
http://field.com/fieldpollonline/subscribers/RIs2393.pdf ). 

37.6% of all Californians are Latino (Source: U.S. Census Bureau). 

In this problem, let: 


• C = Californians (registered voters) preferring life in prison without parole over the death penalty for a 
person convicted of first degree murder 
• L = Latino Californians 


Suppose that one Californian is randomly selected. 


Analyze the Data 
Exercise: 
Problem: P(C) = 
Solution: 


0.48 


Exercise: 


Problem: P(L) = 


Solution: 


0.376 


Exercise: 
Problem: P(C|L) = 


Solution: 


0.55 


Exercise: 


Problem: In words, what is "C|L"? 


Exercise: 


Problem: P(L AND C) = 


Solution: 


0.2068 


Exercise: 


Problem: In words, what is “L AND C”? 


Exercise: 


Problem: Are L and C independent events? Show why or why not. 


Solution: 
No 


Exercise: 


Problem: P(L OR C) = 


Solution: 
0.6492 


Exercise: 


Problem: In words, what is “L OR C”? 


Exercise: 


Problem: Are L and C mutually exclusive events? Show why or why not. 


Solution: 


No 
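
The numerical answers in this practice follow from the multiplication and addition rules. The sketch below (illustrative variable names) recomputes them from the three given percentages and shows the comparisons behind the two "No" answers.

```python
from math import isclose

p_c = 0.48           # P(C): prefers life in prison without parole
p_l = 0.376          # P(L): Latino Californian
p_c_given_l = 0.55   # P(C|L)

# Multiplication rule: P(L AND C) = P(L) * P(C|L)
p_l_and_c = p_l * p_c_given_l

# Addition rule: P(L OR C) = P(L) + P(C) - P(L AND C)
p_l_or_c = p_l + p_c - p_l_and_c

# Independent only if P(C|L) equals P(C)
independent = isclose(p_c_given_l, p_c)

# Mutually exclusive only if P(L AND C) is 0
mutually_exclusive = isclose(p_l_and_c, 0.0, abs_tol=1e-12)

print(round(p_l_and_c, 4), round(p_l_or_c, 4), independent, mutually_exclusive)
```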


Homework 

Probability: Homework is part of the collection col10555 written by Barbara Illowsky and Susan 
Dean and provides a number of homework exercises related to Probability with contributions 
from Roberta Bloom. 

Exercise: 


Problem: 


Suppose that you have 8 cards. 5 are green and 3 are yellow. The 5 green cards are 
numbered 1, 2, 3, 4, and 5. The 3 yellow cards are numbered 1, 2, and 3. The cards are well 
shuffled. You randomly draw one card. 


• G = card drawn is green 
• E = card drawn is even-numbered 


a. List the sample space. 
b. P(G) = 
c. P(G|E) = 
d. P(G AND E) = 
e. P(G OR E) = 
f. Are G and E mutually exclusive? Justify your answer numerically. 


Solution: 


a. {G1, G2, G3, G4, G5, Y1, Y2, Y3} 
b. 5/8 
c. 2/3 
d. 2/8 
e. 6/8 
f. No 


Exercise: 


Problem: 


Refer to the previous problem. Suppose that this time you randomly draw two cards, one at a 
time, and with replacement. 


• G1 = first card is green 
• G2 = second card is green 


a. Draw a tree diagram of the situation. 
b. P(G1 AND G2) = 
c. P(at least one green) = 
d. P(G2 | G1) = 
e. Are G2 and G1 independent events? Explain why or why not. 


Exercise: 


Problem: 


Refer to the previous problems. Suppose that this time you randomly draw two cards, one at 
a time, and without replacement. 


• G1 = first card is green 
• G2 = second card is green 


a. Draw a tree diagram of the situation. 
b. P(G1 AND G2) = 
c. P(at least one green) = 
d. P(G2 | G1) = 
e. Are G2 and G1 independent events? Explain why or why not. 


Solution: 
b. (5/8)(4/7) 
c. (5/8)(3/7) + (3/8)(5/7) + (5/8)(4/7) 
d. 4/7 
e. No 
Exercise: 


Problem: Roll two fair dice. Each die has 6 faces. 


a. List the sample space. 
b. Let A be the event that either a 3 or 4 is rolled first, followed by an even number. Find 
P(A). 
c. Let B be the event that the sum of the two rolls is at most 7. Find P(B). 
d. In words, explain what “P(A|B)” represents. Find P(A|B). 
e. Are A and B mutually exclusive events? Explain your answer in 1 to 3 complete 
sentences, including numerical justification. 
f. Are A and B independent events? Explain your answer in 1 to 3 complete sentences, 
including numerical justification. 


Exercise: 


Problem: 


A special deck of cards has 10 cards. Four are green, three are blue, and three are red. When 
a card is picked, the color of it is recorded. An experiment consists of first picking a card 
and then tossing a coin. 


a. List the sample space. 
b. Let A be the event that a blue card is picked first, followed by landing a head on the 
coin toss. Find P(A). 
c. Let B be the event that a red or green is picked, followed by landing a head on the 
coin toss. Are the events A and B mutually exclusive? Explain your answer in 1 to 3 
complete sentences, including numerical justification. 
d. Let C be the event that a red or blue is picked, followed by landing a head on the coin 
toss. Are the events A and C mutually exclusive? Explain your answer in 1 to 3 
complete sentences, including numerical justification. 


Solution: 


a. {GH, GT, BH, BT, RH, RT} 


Exercise: 


Problem: An experiment consists of first rolling a die and then tossing a coin: 


a. List the sample space. 
b. Let A be the event that either a 3 or 4 is rolled first, followed by landing a head on the 
coin toss. Find P(A). 
c. Let B be the event that a number less than 2 is rolled, followed by landing a head on 
the coin toss. Are the events A and B mutually exclusive? Explain your answer in 1 to 3 
complete sentences, including numerical justification. 


Exercise: 


Problem: 


An experiment consists of tossing a nickel, a dime and a quarter. Of interest is the side the 
coin lands on. 


a. List the sample space. 
b. Let A be the event that there are at least two tails. Find P(A). 
c. Let B be the event that the first and second tosses land on heads. Are the events A and 
B mutually exclusive? Explain your answer in 1 to 3 complete sentences, including 
justification. 


Solution: 


a. {(HHH), (HHT), (HTH), (HTT), (THH), (THT), (TTH), (TTT)} 
b. 4/8 
c. Yes 


Exercise: 


Problem: Consider the following scenario: 


• Let P(C) = 0.4 
• Let P(D) = 0.5 
• Let P(C|D) = 0.6 


a. Find P(C AND D). 
b. Are C and D mutually exclusive? Why or why not? 
c. Are C and D independent events? Why or why not? 
d. Find P(C OR D). 
e. Find P(D|C). 


Exercise: 


Problem: E and F are mutually exclusive events. P(E) = 0.4; P(F) = 0.5. Find P(E | F). 


Solution: 


0 


Exercise: 


Problem: J and K are independent events. P(J | K) = 0.3. Find P(J) . 


Exercise: 


Problem: U and V are mutually exclusive events. P(U) = 0.26; P(V) = 0.37. Find: 


a. P(U AND V) = 
b. P(U | V) = 
c. P(U OR V) = 


Solution: 


a. 0 
b. 0 
c. 0.63 
Exercise: 
Problem: 
Q and R are independent events. P(Q) = 0.4; P(Q AND R) = 0.1. Find P(R). 


Exercise: 


Problem: Y and Z are independent events. 


a. Rewrite the basic Addition Rule P(Y OR Z) = P(Y) + P(Z) - P(Y AND Z) 
using the information that Y and Z are independent events. 
b. Use the rewritten rule to find P(Z) if P(Y OR Z) = 0.71 and P(Y) = 0.42. 


Solution: 


b. 0.5 
Exercise: 


Problem: G and H are mutually exclusive events. P(G) = 0.5; P(H) = 0.3 


a. Explain why the following statement MUST be false: P(H | G) = 0.4. 
b. Find: P(H OR G). 
c. Are G and H independent or dependent events? Explain in a complete sentence. 


Exercise: 
Problem: 
The following are real data from Santa Clara County, CA. As of a certain time, there had 
been a total of 3059 documented cases of AIDS in the county. They were grouped into the 
following categories (Source: Santa Clara County Public H.D.): 


          Homosexual/Bisexual    IV Drug User*    Heterosexual Contact    Other    Totals 
Female    0                      70               136                     49       ___ 
Male      2146                   463              60                      135      ___ 
Totals    ___                    ___              ___                     ___      ___ 


* includes homosexual/bisexual IV drug users 


Suppose one of the persons with AIDS in Santa Clara County is randomly selected. 
Compute the following: 


a. P(person is female) = 
b. P(person has a risk factor Heterosexual Contact) = 
c. P(person is female OR has a risk factor of IV Drug User) = 
d. P(person is female AND has a risk factor of Homosexual/Bisexual) = 
e. P(person is male AND has a risk factor of IV Drug User) = 
f. P(female GIVEN person got the disease from heterosexual contact) = 
g. Construct a Venn Diagram. Make one group females and the other group heterosexual 
contact. 


Solution: 


The completed contingency table is as follows: 


          Homosexual/Bisexual    IV Drug User*    Heterosexual Contact    Other    Totals 
Female    0                      70               136                     49       255 
Male      2146                   463              60                      135      2804 
Totals    2146                   533              196                     184      3059 


* includes homosexual/bisexual IV drug users 


a. 255/3059 
b. 196/3059 
c. 718/3059 
d. 0 
e. 463/3059 
f. 136/196 


Exercise: 


Problem: 




Solve these questions using probability rules. Do NOT use the contingency table above. 
3059 cases of AIDS had been reported in Santa Clara County, CA, through a certain date. 
Those cases will be our population. Of those cases, 6.4% obtained the disease through 
heterosexual contact and 7.4% are female. Out of the females with the disease, 53.3% got 
the disease from heterosexual contact. 


a. P(person is female) = 
b. P(person obtained the disease through heterosexual contact) = 
c. P(female GIVEN person got the disease from heterosexual contact) = 
d. Construct a Venn Diagram. Make one group females and the other group heterosexual 
contact. Fill in all values as probabilities. 


Exercise: 


Problem: 


The following table identifies a group of children by one of four hair colors, and by type of 
hair. 


Hair Type    Brown    Blond    Black    Red    Totals 
Wavy         20       ___      15       3      43 
Straight     80       15       ___      12     ___ 
Totals       ___      20       ___      ___    215 


a. Complete the table above. 
b. What is the probability that a randomly selected child will have wavy hair? 
c. What is the probability that a randomly selected child will have either brown or blond 
hair? 
d. What is the probability that a randomly selected child will have wavy brown hair? 
e. What is the probability that a randomly selected child will have red hair, given that he 
has straight hair? 
f. If B is the event of a child having brown hair, find the probability of the complement 
of B. 
g. In words, what does the complement of B represent? 


Solution: 


Exercise: 


Problem: 


A previous year, the weights of the members of the San Francisco 49ers and the Dallas 
Cowboys were published in the San Jose Mercury News. The factual data are compiled into 
the following table. 


Shirt#    ≤ 210    211-250    251-290    290 ≤ 
1-33      21       5          0          0 
34-66     6        18         7          4 
66-99     6        12         22         5 


For the following, suppose that you randomly select one player from the 49ers or Cowboys. 


a. Find the probability that his shirt number is from 1 to 33. 
b. Find the probability that he weighs at most 210 pounds. 
c. Find the probability that his shirt number is from 1 to 33 AND he weighs at most 210 
pounds. 
d. Find the probability that his shirt number is from 1 to 33 OR he weighs at most 210 
pounds. 
e. Find the probability that his shirt number is from 1 to 33 GIVEN that he weighs at 
most 210 pounds. 
f. If having a shirt number from 1 to 33 and weighing at most 210 pounds were 
independent events, then what should be true about P(Shirt# 1-33 | ≤ 210 pounds)? 


Exercise: 


Problem: 


Approximately 281,000,000 people over age 5 live in the United States. Of these people, 
55,000,000 speak a language other than English at home. Of those who speak another 
language at home, 62.3% speak Spanish. (Source: 
http://www.census.gov/hhes/socdemo/language/data/acs/ACS-12.pdf) 


Let: E = speak English at home; E’ = speak another language at home; S = speak Spanish. 


Finish each probability statement by matching the correct answer. 


Probability Statements    Answers 
a. P(E’) =                i. 0.8043 
b. P(E) =                 ii. 0.623 
c. P(S and E’) =          iii. 0.1957 
d. P(S|E’) =              iv. 0.1219 


Solution: 

a. iii 
b. i 
c. iv 
d. ii 


Exercise: 


Problem: 


The probability that a male develops some form of cancer in his lifetime is 0.4567 (Source: 
American Cancer Society). The probability that a male has at least one false positive test 
result (meaning the test comes back for cancer when the man does not have it) is 0.51 
(Source: USA Today). Some of the questions below do not have enough information for you 
to answer them. Write “not enough information” for those answers. 


Let: C = a man develops cancer in his lifetime; P = man has at least one false positive 


a. Construct a tree diagram of the situation. 
b. P(C) = 
c. P(P|C) = 
d. P(P|C’) = 
e. If a test comes up positive, based upon numerical values, can you assume that man has 
cancer? Justify numerically and explain why or why not. 


Exercise: 


Problem: 


In 1994, the U.S. government held a lottery to issue 55,000 Green Cards (permits for 
non-citizens to work legally in the U.S.). Renate Deutsch, from Germany, was one of 
approximately 6.5 million people who entered this lottery. Let G = won Green Card. 


a. What was Renate’s chance of winning a Green Card? Write your answer as a 
probability statement. 
b. In the summer of 1994, Renate received a letter stating she was one of 110,000 
finalists chosen. Once the finalists were chosen, assuming that each finalist had an 
equal chance to win, what was Renate’s chance of winning a Green Card? Let 
F = was a finalist. Write your answer as a conditional probability statement. 
c. Are G and F independent or dependent events? Justify your answer numerically and 
also explain why. 
d. Are G and F mutually exclusive events? Justify your answer numerically and also 
explain why. 


Note: P.S. Amazingly, on 2/1/95, Renate learned that she would receive her Green Card. 
True story! 


Solution: 


a. P(G) = 0.008 
b. 0.5 
c. dependent 
d. No 


Exercise: 


Problem: 


Three professors at George Washington University did an experiment to determine if 
economists are more selfish than other people. They dropped 64 stamped, addressed 
envelopes with $10 cash in different classrooms on the George Washington campus. 44% 
were returned overall. From the economics classes 56% of the envelopes were returned. 
From the business, psychology, and history classes 31% were returned. (Source: Wall Street 
Journal) 


Let: R = money returned; E = economics classes; O = other classes


• a. Write a probability statement for the overall percent of money returned.

• b. Write a probability statement for the percent of money returned out of the economics
classes.

• c. Write a probability statement for the percent of money returned out of the other
classes.

• d. Is money being returned independent of the class? Justify your answer numerically
and explain it.

• e. Based upon this study, do you think that economists are more selfish than other
people? Explain why or why not. Include numbers to justify your answer.


Exercise: 


Problem: 


The chart below gives the number of suicides estimated in the U.S. for a recent year by age, 
race (black and white), and sex. We are interested in possible relationships between age, 
race, and sex. We will let suicide victims be our population. (Source: The National Center 
for Health Statistics, U.S. Dept. of Health and Human Services) 


Race and Sex     1 - 14    15 - 24    25 - 64    over 64    TOTALS
white, male         210       3360     13,610                22,050
white, female        80        580       3380                  4930
black, male          10        460       1060                  1670
black, female         0         40        270                   330
all others
TOTALS              310       4650     18,780                29,760


Note: Do not include "all others" for parts (f), (g), and (i).


• a. Fill in the column for the suicides for individuals over age 64.

• b. Fill in the row for all other races.

• c. Find the probability that a randomly selected individual was a white male.

• d. Find the probability that a randomly selected individual was a black female.

• e. Find the probability that a randomly selected individual was black.

• f. Find the probability that a randomly selected individual was male.

• g. Out of the individuals over age 64, find the probability that a randomly selected
individual was a black or white male.

• h. Comparing “Race and Sex” to “Age,” which two groups are mutually exclusive? How
do you know?

• i. Are being male and committing suicide over age 64 independent events? How do you
know?


Solution: 


• c. 22050/29760
• d. 330/29760
• e. 2000/29760
• f. 23720/…
• g. …/6020
• h. Black females and ages 1-14
• i. No


The next two questions refer to the following: The percent of licensed U.S. drivers (from a 
recent year) that are female is 48.60. Of the females, 5.03% are age 19 and under; 81.36% are age 
20 - 64; 13.61% are age 65 or over. Of the licensed U.S. male drivers, 5.04% are age 19 and 
under; 81.43% are age 20 - 64; 13.53% are age 65 or over. (Source: Federal Highway 
Administration, U.S. Dept. of Transportation) 

Exercise: 


Problem: Complete the following: 


• a. Construct a table or a tree diagram of the situation.

• b. P(driver is female) =

• c. P(driver is age 65 or over | driver is female) =

• d. P(driver is age 65 or over AND female) =

• e. In words, explain the difference between the probabilities in part (c) and part (d).

• f. P(driver is age 65 or over) =

• g. Are being age 65 or over and being female mutually exclusive events? How do you
know?


Exercise: 


Problem: Suppose that 10,000 U.S. licensed drivers are randomly selected. 


• a. How many would you expect to be male?

• b. Using the table or tree diagram from the previous exercise, construct a contingency
table of gender versus age group.

• c. Using the contingency table, find the probability that out of the age 20 - 64 group, a
randomly selected driver is female.


Solution: 


• a. 5140
• c. 0.49


Exercise: 


Problem: 


Approximately 86.5% of Americans commute to work by car, truck or van. Out of that 
group, 84.6% drive alone and 15.4% drive in a carpool. Approximately 3.9% walk to work 
and approximately 5.3% take public transportation. (Source: Bureau of the Census, U.S. 
Dept. of Commerce. Disregard rounding approximations.) 


• a. Construct a table or a tree diagram of the situation. Include a branch for all other
modes of transportation to work.

• b. Assuming that the walkers walk alone, what percent of all commuters travel alone to
work?

• c. Suppose that 1000 workers are randomly selected. How many would you expect to
travel alone to work?

• d. Suppose that 1000 workers are randomly selected. How many would you expect to
drive in a carpool?


Exercise: 


Problem: Explain what is wrong with the following statements. Use complete sentences. 


• a. If there’s a 60% chance of rain on Saturday and a 70% chance of rain on Sunday, then
there’s a 130% chance of rain over the weekend.

• b. The probability that a baseball player hits a home run is greater than the probability
that he gets a successful hit.


Try these multiple choice questions. 


The next two questions refer to the following probability tree diagram which shows tossing
an unfair coin FOLLOWED BY drawing one bead from a cup containing 3 red (R), 4 yellow (Y)
and 5 blue (B) beads. For the coin, P(H) = 2/3 and P(T) = 1/3 where H = "heads" and
T = "tails".


[Tree diagram: the first set of branches is H (2/3) and T (1/3); from each of these, the
second set of branches is R (3/12), Y (4/12), and B (5/12).]


Exercise: 


Problem: Find P(tossing a Head on the coin AND a Red bead) 


• A. …
• B. …
• C. …
• D. …


Solution: 


C 


Exercise: 


Problem: Find P(Blue bead). 


• A. …
• B. …
• C. …
• D. …


Solution: 
A 
The next three questions refer to the following table of data obtained from
www.baseball-almanac.com showing hit information for 4 well known baseball players.
Suppose that one hit from the table is randomly selected.


NAME               Single    Double    Triple    Home Run    TOTAL HITS
Babe Ruth            1517       506       136         714          2873
Jackie Robinson      1054       273        54         137          1518
Ty Cobb              3603       174       295         114          4189
Hank Aaron           2294       624        98         759          3771
TOTAL                8471      1577       583        1720         12351

Exercise:


Problem: Find P(hit was made by Babe Ruth).


• A. …
• B. …
• C. …
• D. …


Solution:


B


Exercise:


Problem: Find P(hit was made by Ty Cobb | The hit was a Home Run) 


Solution: 


B 
Exercise: 


Problem: 


Are the hit being made by Hank Aaron and the hit being a double independent
events?


• A. Yes, because P(hit by Hank Aaron | hit is a double) = P(hit by Hank Aaron)
• B. No, because P(hit by Hank Aaron | hit is a double) ≠ P(hit is a double)
• C. No, because P(hit is by Hank Aaron | hit is a double) ≠ P(hit by Hank Aaron)
• D. Yes, because P(hit is by Hank Aaron | hit is a double) = P(hit is a double)


Solution: 


C


Exercise: 


Problem: Given events G and H: P(G) = 0.43 ; P(H) = 0.26 ; P(H and G) = 0.14 


• A. Find P(H or G)
• B. Find the probability of the complement of event (H and G)
• C. Find the probability of the complement of event (H or G)


Solution: 


• A. P(H or G) = P(H) + P(G) - P(H and G) = 0.26 + 0.43 - 0.14 = 0.55
• B. P( NOT (H and G) ) = 1 - P(H and G) = 1 - 0.14 = 0.86
• C. P( NOT (H or G) ) = 1 - P(H or G) = 1 - 0.55 = 0.45
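The addition rule and the complement rule used in this solution can be checked with a few lines of Python. This is only a sketch: the function names prob_or and complement are illustrative choices, not notation from the text.

```python
def prob_or(p_a: float, p_b: float, p_a_and_b: float) -> float:
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b


def complement(p: float) -> float:
    """Complement rule: P(NOT A) = 1 - P(A)."""
    return 1 - p


# Values from the exercise: P(G) = 0.43, P(H) = 0.26, P(H and G) = 0.14
p_h_or_g = prob_or(0.26, 0.43, 0.14)
print(round(p_h_or_g, 2))              # P(H or G)
print(round(complement(0.14), 2))      # P(NOT (H and G))
print(round(complement(p_h_or_g), 2))  # P(NOT (H or G))
```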


Exercise: 


Problem: Given events J and K: P(J) = 0.18 ; P(K) = 0.37 ; P(J or K) = 0.45


• A. Find P(J and K)
• B. Find the probability of the complement of event (J and K)
• C. Find the probability of the complement of event (J or K)


Solution: 


• A. P(J or K) = P(J) + P(K) - P(J and K); 0.45 = 0.18 + 0.37 - P(J and K); solve to find
P(J and K) = 0.10

• B. P( NOT (J and K) ) = 1 - P(J and K) = 1 - 0.10 = 0.90

• C. P( NOT (J or K) ) = 1 - P(J or K) = 1 - 0.45 = 0.55


Exercise: 


Problem: 


United Blood Services is a blood bank that serves more than 500 hospitals in 18 states. 
According to their website, http://www.unitedbloodservices.org/humanbloodtypes.html, a 
person with type O blood and a negative Rh factor (Rh-) can donate blood to any person 
with any bloodtype. Their data show that 43% of people have type O blood and 15% of 
people have Rh- factor; 52% of people have type O or Rh- factor. 


• A. Find the probability that a person has both type O blood and the Rh- factor.
• B. Find the probability that a person does NOT have both type O blood and the Rh-
factor.


Solution: 


• A. P(Type O or Rh-) = P(Type O) + P(Rh-) - P(Type O and Rh-)
0.52 = 0.43 + 0.15 - P(Type O and Rh-); solve to find P(Type O and Rh-) = 0.06
6% of people have type O Rh- blood

• B. P( NOT (Type O and Rh-) ) = 1 - P(Type O and Rh-) = 1 - 0.06 = 0.94
94% of people do not have type O Rh- blood


Exercise: 
Problem: 
At a college, 72% of courses have final exams and 46% of courses require research papers.
Suppose that 32% of courses have a research paper and a final exam. Let F be the event that
a course has a final exam. Let R be the event that a course requires a research paper.


• A. Find the probability that a course has a final exam or a research paper.
• B. Find the probability that a course has NEITHER of these two requirements.


Solution: 


• A. P(R or F) = P(R) + P(F) - P(R and F) = 0.46 + 0.72 - 0.32 = 0.86
• B. P( Neither R nor F ) = 1 - P(R or F) = 1 - 0.86 = 0.14


Exercise: 
Problem: 


In a box of assorted cookies, 36% contain chocolate and 12% contain nuts. Of all the
cookies in the box, 8% contain both chocolate and nuts. Sean is allergic to both chocolate
and nuts.


• A. Find the probability that a cookie contains chocolate or nuts (he can't eat it).
• B. Find the probability that a cookie does not contain chocolate or nuts (he can eat it).


Solution: 


• Let C be the event that the cookie contains chocolate. Let N be the event that the cookie
contains nuts.

• A. P(C or N) = P(C) + P(N) - P(C and N) = 0.36 + 0.12 - 0.08 = 0.40

• B. P( neither chocolate nor nuts) = 1 - P(C or N) = 1 - 0.40 = 0.60


Exercise: 


Problem: 


A college finds that 10% of students have taken a distance learning class and that 40% of 
students are part time students. Of the part time students, 20% have taken a distance learning 
class. Let D = event that a student takes a distance learning class and E = event that a student 
is a part time student 


• A. Find P(D and E)

• B. Find P(E | D)

• C. Find P(D or E)

• D. Using an appropriate test, show whether D and E are independent.

• E. Using an appropriate test, show whether D and E are mutually exclusive.


Solution: 


• A. P(D and E) = P(D|E)P(E) = (0.20)(0.40) = 0.08

• B. P(E|D) = P(D and E) / P(D) = 0.08/0.10 = 0.80

• C. P(D or E) = P(D) + P(E) - P(D and E) = 0.10 + 0.40 - 0.08 = 0.42

• D. Not Independent: P(D|E) = 0.20, which does not equal P(D) = 0.10

• E. Not Mutually Exclusive: P(D and E) = 0.08; if they were mutually exclusive then we
would need to have P(D and E) = 0, which is not true here.
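The same chain of rules (multiplication rule, conditional probability, addition rule, then the two tests) can be written as a short Python sketch. The variable names are illustrative, and math.isclose is used only to avoid floating-point surprises when comparing decimals.

```python
import math

p_d = 0.10          # P(D): student has taken a distance learning class
p_e = 0.40          # P(E): student is a part time student
p_d_given_e = 0.20  # P(D|E): given in the problem

p_d_and_e = p_d_given_e * p_e     # multiplication rule
p_e_given_d = p_d_and_e / p_d     # definition of conditional probability
p_d_or_e = p_d + p_e - p_d_and_e  # addition rule

# Independence test: is P(D|E) equal to P(D)?
independent = math.isclose(p_d_given_e, p_d)
# Mutually exclusive test: is P(D and E) equal to 0?
mutually_exclusive = math.isclose(p_d_and_e, 0.0, abs_tol=1e-12)

print(round(p_d_and_e, 2), round(p_e_given_d, 2), round(p_d_or_e, 2))
print(independent, mutually_exclusive)
```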


Exercise: 


Problem: 


When the Euro coin was introduced in 2002, two math professors had their statistics 
students test whether the Belgian 1 Euro coin was a fair coin. They spun the coin rather than 
tossing it, and it was found that out of 250 spins, 140 showed a head (event H) while 110 
showed a tail (event T). Therefore, they claim that this is not a fair coin. 


• A. Based on the data above, find P(H) and P(T).

• B. Use a tree to find the probabilities of each possible outcome for the experiment of
tossing the coin twice.

• C. Use the tree to find the probability of obtaining exactly one head in two tosses of the
coin.

• D. Use the tree to find the probability of obtaining at least one head.


Solution: 


• A. P(H) = 140/250; P(T) = 110/250
• C. 308/625
• D. 504/625
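One way to check the tree for two tosses is to enumerate its four paths with exact fractions. The sketch below assumes the two tosses are independent, so each path probability is the product of its two branch probabilities; the names p and paths are illustrative.

```python
from fractions import Fraction
from itertools import product

# Branch probabilities estimated from the 250 spins
p = {"H": Fraction(140, 250), "T": Fraction(110, 250)}

# Each path through the two-level tree is an ordered pair of outcomes;
# its probability is the product of the probabilities along its branches.
paths = {a + b: p[a] * p[b] for a, b in product("HT", repeat=2)}

exactly_one_head = sum(pr for path, pr in paths.items() if path.count("H") == 1)
at_least_one_head = 1 - paths["TT"]  # complement of "no heads"

print(exactly_one_head)   # 308/625
print(at_least_one_head)  # 504/625
```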


Exercise: 
Problem: 
A box of cookies contains 3 chocolate and 7 butter cookies. Miguel randomly selects a
cookie and eats it. Then he randomly selects another cookie and eats it also. (How many
cookies did he take?)


• A. Draw the tree that represents the possibilities for the cookie selections. Write the
probabilities along each branch of the tree.

• B. Are the probabilities for the flavor of the SECOND cookie that Miguel selects
independent of his first selection? Explain.

• C. For each complete path through the tree, write the event it represents and find the
probabilities.

• D. Let S be the event that both cookies selected were the same flavor. Find P(S).

• E. Let T be the event that both cookies selected were different flavors. Find P(T) by two
different methods: by using the complement rule and by using the branches of the tree.
Your answers should be the same with both methods.

• F. Let U be the event that the second cookie selected is a butter cookie. Find P(U).


**Exercises 33 - 40 contributed by Roberta Bloom 


Review 
This module provides a number of homework/review exercises related to 
Probability. 


The first six exercises refer to the following study: In a survey of 100 
stocks on NASDAQ, the average percent increase for the past year was 9% 
for NASDAQ stocks. Answer the following: 

Exercise: 


Problem: The “average increase” for all NASDAQ stocks is the: 
• A. Population
• B. Statistic
• C. Parameter
• D. Sample
• E. Variable


Solution:


• C. Parameter
Exercise: 


Problem: All of the NASDAQ stocks are the: 


• A. Population
• B. Statistic
• C. Parameter
• D. Sample
• E. Variable


Solution:


• A. Population


Exercise: 


Problem: 9% is the: 
• A. Population
• B. Statistic
• C. Parameter
• D. Sample
• E. Variable


Solution:


• B. Statistic
Exercise: 


Problem: The 100 NASDAQ stocks in the survey are the: 
• A. Population
• B. Statistic
• C. Parameter
• D. Sample
• E. Variable


Solution:


• D. Sample
Exercise: 


Problem: The percent increase for one stock in the survey is the: 


• A. Population
• B. Statistic
• C. Parameter
• D. Sample
• E. Variable
Solution:
• E. Variable
Exercise: 


Problem: 


Would the data collected be qualitative, quantitative — discrete, or 
quantitative — continuous? 


Solution: 


quantitative - continuous 


The next two questions refer to the following study: Thirty people spent 
two weeks around Mardi Gras in New Orleans. Their two-week weight gain 
is below. (Note: a loss is shown by a negative weight gain.) 


Weight Gain    Frequency
    -2             3
    -1             5
     0             2
     1             4
     4            13
     6             2
    11             1
Exercise: 


Problem: Calculate the following values: 
• a. The average weight gain for the two weeks
• b. The standard deviation
• c. The first, second, and third quartiles
Solution:
• a. 2.27
• b. 3.04
• c. -1, 4, 4
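These values can be reproduced from the frequency table with Python's standard statistics module. This is a sketch under two assumptions: the frequency for a weight gain of -1 is taken to be 5, so that the frequencies total the thirty people in the study, and the quartiles use the module's default ("exclusive") method, which may differ from a calculator's method on other data sets.

```python
import statistics

# Frequency table: weight gain -> number of people (frequencies total 30)
frequency = {-2: 3, -1: 5, 0: 2, 1: 4, 4: 13, 6: 2, 11: 1}

# Expand the table into the 30 individual observations
data = sorted(x for x, f in frequency.items() for _ in range(f))

mean = statistics.mean(data)
sd = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
q1, q2, q3 = statistics.quantiles(data, n=4)

print(len(data), round(mean, 2), round(sd, 2), q1, q2, q3)
```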


Exercise: 


Problem: Construct a histogram and a boxplot of the data. 


Discrete Random Variables 
This module serves as the introduction to Discrete Random Variables in the 
Elementary Statistics textbook/collection. 


Student Learning Outcomes 
By the end of this chapter, the student should be able to: 


• Recognize and understand discrete probability distribution functions,
in general.

• Calculate and interpret expected values.

• Recognize the binomial probability distribution and apply it
appropriately.

• Recognize the Poisson probability distribution and apply it
appropriately (optional).

• Recognize the geometric probability distribution and apply it
appropriately (optional).

• Recognize the hypergeometric probability distribution and apply it
appropriately (optional).

• Classify discrete word problems by their distributions.


Introduction 


A student takes a 10 question true-false quiz. Because the student had such 
a busy schedule, he or she could not study and randomly guesses at each 
answer. What is the probability of the student passing the test with at least a 
70%? 


Small companies might be interested in the number of long distance phone 
calls their employees make during the peak time of the day. Suppose the 
average is 20 calls. What is the probability that the employees make more 
than 20 long distance phone calls during the peak time? 


These two examples illustrate two different types of probability problems 
involving discrete random variables. Recall that discrete data are data that 
you can count. A random variable describes the outcomes of a statistical
experiment in words. The values of a random variable can vary with each
repetition of an experiment.


In this chapter, you will study probability problems involving discrete 
random distributions. You will also study long-term averages associated 
with them. 


Random Variable Notation 


Upper case letters like X or Y denote a random variable. Lower case letters
like x or y denote the value of a random variable. If X is a random
variable, then X is written in words, and x is given as a number.


For example, let X = the number of heads you get when you toss three fair
coins. The sample space for the toss of three fair coins is TTT THH HTH
HHT HTT THT TTH HHH. Then, x = 0, 1, 2, 3. X is in words and x is
a number. Notice that for this example, the x values are countable
outcomes. Because you can count the possible values that X can take on
and the outcomes are random (the x values 0, 1, 2, 3), X is a discrete
random variable.
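The three-coin example can be enumerated directly. The following sketch lists the sample space and counts how many outcomes give each value of x; the names sample_space and heads are illustrative.

```python
from collections import Counter
from itertools import product

# All ordered outcomes of tossing three coins
sample_space = ["".join(t) for t in product("HT", repeat=3)]

# x = number of heads in each outcome; count how often each x occurs
heads = Counter(outcome.count("H") for outcome in sample_space)

print(sample_space)
print(sorted(heads.items()))  # pairs of (x, number of outcomes with x heads)
```

Because the coins are fair, each of the 8 outcomes is equally likely, so dividing each count by 8 gives P(x).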


Optional Collaborative Classroom Activity 


Toss a coin 10 times and record the number of heads. After all members of 
the class have completed the experiment (tossed a coin 10 times and 
counted the number of heads), fill in the chart using a heading like the one 
below. Let X = the number of heads in 10 tosses of the coin. 


x Frequency of x Relative Frequency of x 




• Which value(s) of x occurred most frequently?

• If you tossed the coin 1,000 times, what values could x take on?
Which value(s) of x do you think would occur most frequently?

• What does the relative frequency column sum to?


Glossary 


Random Variable (RV) 
see Variable 


Variable (Random Variable) 
A characteristic of interest in a population being studied. Common 
notation for variables are upper case Latin letters X, Y, Z,...; common 
notation for a specific value from the domain (set of all possible values 
of a variable) are lower case Latin letters x, y, z,.... For example, if X 
is the number of children in a family, then x represents a specific 
integer 0, 1, 2, 3, .... Variables in statistics differ from variables in 
intermediate algebra in two following ways. 


• The domain of the random variable (RV) is not necessarily a
numerical set; the domain may be expressed in words; for
example, if X = hair color then the domain is {black, blond, gray,
green, orange}.

• We can tell what specific value x of the Random Variable X takes
only after performing the experiment.


Probability Distribution Function (PDF) for a Discrete Random Variable 
This module introduces the Probability Distribution Function (PDF) and its 
characteristics. 


A discrete probability distribution function has two characteristics: 


• Each probability is between 0 and 1, inclusive.
• The sum of the probabilities is 1.


Example: 

A child psychologist is interested in the number of times a newborn baby's
crying wakes its mother after midnight. For a random sample of 50
mothers, the following information was obtained. Let X = the number of
times a newborn wakes its mother after midnight. For this example, x = 0,
1, 2, 3, 4, 5.

P(x) = probability that X takes on a value x.


x    P(x)

0    P(x=0) = 2/50
1    P(x=1) = 11/50
2    P(x=2) = 23/50
3    P(x=3) = 9/50
4    P(x=4) = 4/50
5    P(x=5) = 1/50


X takes on the values 0, 1, 2, 3, 4, 5. This is a discrete PDF because 


1. Each P(x) is between 0 and 1, inclusive. 
2. The sum of the probabilities is 1, that is, 


Equation: 


2/50 + 11/50 + 23/50 + 9/50 + 4/50 + 1/50 = 1


Example: 


Suppose Nancy has classes 3 days a week. She attends classes 3 days a
week 80% of the time, 2 days 15% of the time, 1 day 4% of the time, and
no days 1% of the time. Suppose one week is randomly selected.
Exercise: 


Problem: 

Let X = the number of days Nancy ____________________.

Solution: 

Let X = the number of days Nancy attends class per week. 
Exercise: 

Problem: X takes on what values? 

Solution: 


0, 1, 2, and 3


Exercise: 


Problem: 


Suppose one week is randomly chosen. Construct a probability 
distribution table (called a PDF table) like the one in the previous 
example. The table should have two columns labeled x and P(x). 
What does the P(x) column sum to? 


Solution: 
x P(x) 
0 0.01 
1 0.04 
2 0.15 
3 0.80 
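The two characteristics of a discrete PDF can be checked mechanically for this table. The helper name is_discrete_pdf below is a hypothetical choice for illustration; math.isclose is used because decimal probabilities do not always add to exactly 1 in floating point.

```python
import math


def is_discrete_pdf(table: dict) -> bool:
    """Check the two characteristics of a discrete PDF."""
    in_range = all(0 <= p <= 1 for p in table.values())       # each P(x) in [0, 1]
    sums_to_one = math.isclose(sum(table.values()), 1.0)      # probabilities sum to 1
    return in_range and sums_to_one


# PDF table for the number of days Nancy attends class per week
nancy = {0: 0.01, 1: 0.04, 2: 0.15, 3: 0.80}
print(is_discrete_pdf(nancy))
```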
Glossary 


Probability Distribution Function (PDF) 
A mathematical description of a discrete random variable (RV), given 
either in the form of an equation (formula) , or in the form of a table 
listing all the possible outcomes of an experiment and the probability 
associated with each outcome. 


Example: 

A biased coin with probability 0.7 for a head (in one toss of the coin) is
tossed 5 times. We are interested in the number of heads (the RV X = the
number of heads). X is Binomial, so X ~ B(5, 0.7) and
P(X = x) = C(5, x) (0.7)^x (0.3)^(5-x), where C(5, x) is the number of ways to
choose which x of the 5 tosses are heads, or in the form of the table:


x    P(X = x)
0    0.0024
1    0.0284
2    0.1323
3    0.3087
4    0.3602
5    0.1681
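The table for B(5, 0.7) can be regenerated from the formula. The sketch below uses math.comb for the number of ways to choose which tosses are heads; the function name binomial_pmf is an illustrative choice.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Rebuild the table for 5 tosses with P(head) = 0.7
table = {x: binomial_pmf(x, 5, 0.7) for x in range(6)}
for x, prob in table.items():
    print(x, f"{prob:.4f}")
```

The six probabilities, for x = 0 through 5, sum to 1, as every discrete PDF must.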


Mean or Expected Value and Standard Deviation 

This module explores the Law of Large Numbers, the phenomenon where 
an experiment performed many times will yield cumulative results closer 
and closer to the theoretical mean over time. 


The expected value is often referred to as the "long-term" average or
mean. This means that over the long term of doing an experiment over and
over, you would expect this average.


The mean of a random variable X is μ. If we do an experiment many times
(for instance, flip a fair coin, as Karl Pearson did, 24,000 times and let X =
the number of heads) and record the value of X each time, the average is
likely to get closer and closer to μ as we keep repeating the experiment.
This is known as the Law of Large Numbers.
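A small simulation can illustrate the Law of Large Numbers. This is a sketch, not a reproduction of Pearson's experiment: the flips are pseudo-random, the seed is an arbitrary choice so that the run is repeatable, and each flip counts as 1 for a head and 0 for a tail, so that μ = 0.5.

```python
import random


def running_average_of_heads(flips: int, seed: int = 0) -> list:
    """Simulate fair coin flips; return the average number of heads after each flip."""
    rng = random.Random(seed)  # seeded generator so the run is repeatable
    heads = 0
    averages = []
    for i in range(1, flips + 1):
        heads += rng.random() < 0.5  # True counts as 1 head, False as 0
        averages.append(heads / i)
    return averages


averages = running_average_of_heads(24_000)
for n in (10, 100, 1_000, 24_000):
    print(n, round(averages[n - 1], 4))
```

The early averages can be far from 0.5, while the average after many flips is typically very close to it.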


Note: To find the expected value or long term average, μ, simply multiply
each value of the random variable by its probability and add the products.


A Step-by-Step Example 

A men's soccer team plays soccer 0, 1, or 2 days a week. The probability
that they play 0 days is 0.2, the probability that they play 1 day is 0.5, and
the probability that they play 2 days is 0.3. Find the long-term average, μ,
or expected value of the days per week the men's soccer team plays soccer.


To do the problem, first let the random variable X = the number of days the
men's soccer team plays soccer per week. X takes on the values 0, 1, 2.
Construct a PDF table, adding a column xP(x). In this column, you will
multiply each x value by its probability.


x    P(x)    xP(x)

0    0.2     (0)(0.2) = 0
1    0.5     (1)(0.5) = 0.5
2    0.3     (2)(0.3) = 0.6


Expected Value Table: This table is called an expected value table. The table
helps you calculate the expected value or long-term average.


Add the last column to find the long term average or expected value: 
(0)(0.2)+(1)(0.5)+(2)(0.3)= 0 + 0.5 + 0.6 = 1.1. 


The expected value is 1.1. The men's soccer team would, on the average,
expect to play soccer 1.1 days per week. The number 1.1 is the long term
average or expected value if the men's soccer team plays soccer week after
week after week. We say μ = 1.1.
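The step-by-step calculation amounts to summing the xP(x) column. A minimal sketch, with expected_value as an illustrative function name:

```python
def expected_value(pdf: dict) -> float:
    """Multiply each value x by its probability P(x) and add the products."""
    return sum(x * p for x, p in pdf.items())


# PDF for the number of days per week the men's soccer team plays
soccer = {0: 0.2, 1: 0.5, 2: 0.3}
mu = expected_value(soccer)
print(round(mu, 2))
```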


Example: 

Find the expected value for the example about the number of times a 
newborn baby's crying wakes its mother after midnight. The expected 
value is the expected number of times a newborn wakes its mother after 
midnight. 


x    P(x)             xP(x)

0    P(x=0) = 2/50    (0)(2/50) = 0
1    P(x=1) = 11/50   (1)(11/50) = 11/50
2    P(x=2) = 23/50   (2)(23/50) = 46/50
3    P(x=3) = 9/50    (3)(9/50) = 27/50
4    P(x=4) = 4/50    (4)(4/50) = 16/50
5    P(x=5) = 1/50    (5)(1/50) = 5/50


Add the last column to find the expected value. μ = Expected Value =
105/50 = 2.1


You expect a newborn to wake its mother after midnight 2.1 times, on the
average.

Exercise: 


Problem: 


Go back and calculate the expected value for the number of days 
Nancy attends classes a week. Construct the third column to do so. 


Solution: 


2.74 days a week. 


Example: 

Suppose you play a game of chance in which five numbers are chosen
from 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. A computer randomly selects five numbers
from 0 to 9 with replacement. You pay $2 to play and could profit
$100,000 if you match all 5 numbers in order (you get your $2 back plus
$100,000). Over the long term, what is your expected profit of playing the
game?

To do this problem, set up an expected value table for the amount of money 
you can profit. 

Let X = the amount of money you profit. The values of x are not 0, 1, 2, 3, 
4, 5, 6, 7, 8, 9. Since you are interested in your profit (or loss), the values 
of x are 100,000 dollars and -2 dollars. 

To win, you must get all 5 numbers correct, in order. The probability of
choosing one correct number is 1/10 because there are 10 numbers. You may
choose a number more than once. The probability of choosing all 5
numbers correctly and in order is:

Equation:


(1/10)(1/10)(1/10)(1/10)(1/10) = 1 × 10^(-5) = 0.00001


Therefore, the probability of winning is 0.00001 and the probability of 
losing is 
Equation: 


1 — 0.00001 = 0.99999 


The expected value table is as follows. 


          x          P(x)       xP(x)
Loss      -2         0.99999    (-2)(0.99999) = -1.99998
Profit    100,000    0.00001    (100000)(0.00001) = 1


Add the last column. -1.99998 + 1 = -0.99998 


Since —0.99998 is about —1, you would, on the average, expect to lose 
approximately one dollar for each game you play. However, each time you 
play, you either lose $2 or profit $100,000. The $1 is the average or 
expected LOSS per game after playing this game over and over. 
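The expected profit for this game can be computed exactly with fractions, which avoids rounding in the very small winning probability. The variable names below are illustrative.

```python
from fractions import Fraction

# Five independent digits, each matched with probability 1/10
p_win = Fraction(1, 10) ** 5
p_lose = 1 - p_win

# Profit is +100,000 dollars on a win and -2 dollars on a loss
expected_profit = 100_000 * p_win + (-2) * p_lose

print(p_win, float(p_win))
print(expected_profit, float(expected_profit))
```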


Example: 
Suppose you play a game with a biased coin. You play each game by
tossing the coin once. P(heads) = 2/3 and P(tails) = 1/3. If you toss a
head, you pay $6. If you toss a tail, you win $10. If you play this game
many times, will you come out ahead?
Exercise: 


Problem: Define a random variable X. 


Solution: 


X = amount of profit 


Exercise: 


Problem: Complete the following expected value table. 


        x       P(x)    xP(x)
WIN     10      ____    ____
LOSE    ____    ____    ____


Solution: 


        x     P(x)    xP(x)
WIN     10    1/3     10/3
LOSE    -6    2/3     -12/3


Exercise: 


Problem: What is the expected value, μ? Do you come out ahead?


Solution: 


Add the last column of the table. The expected value μ = -2/3. You
lose, on average, about 67 cents each time you play the game so you
do not come out ahead.


Like data, probability distributions have standard deviations. To calculate
the standard deviation (σ) of a probability distribution, find each deviation
from its expected value, square it, multiply it by its probability, add the
products, and take the square root. To understand how to do the calculation,
look at the table for the number of days per week a men's soccer team plays
soccer. To find the standard deviation, add the entries in the column labeled
(x − μ)² · P(x) and take the square root.


x    P(x)    xP(x)             (x − μ)² · P(x)

0    0.2     (0)(0.2) = 0      (0 − 1.1)² (0.2) = 0.242
1    0.5     (1)(0.5) = 0.5    (1 − 1.1)² (0.5) = 0.005
2    0.3     (2)(0.3) = 0.6    (2 − 1.1)² (0.3) = 0.243


Add the last column in the table. 0.242 + 0.005 + 0.243 = 0.490. The
standard deviation is the square root of 0.49: σ = √0.49 = 0.7
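The same standard deviation calculation can be sketched in Python: compute μ first, then sum the squared deviations weighted by their probabilities and take the square root. Variable names are illustrative.

```python
from math import sqrt

# PDF for the number of days per week the men's soccer team plays
soccer = {0: 0.2, 1: 0.5, 2: 0.3}

mu = sum(x * p for x, p in soccer.items())                     # expected value
variance = sum((x - mu) ** 2 * p for x, p in soccer.items())   # sum of (x - mu)^2 P(x)
sigma = sqrt(variance)                                         # standard deviation

print(round(mu, 2), round(variance, 3), round(sigma, 2))
```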


Generally for probability distributions, we use a calculator or a computer to
calculate μ and σ to reduce roundoff error. For some probability
distributions, there are short-cut formulas that calculate μ and σ.


Glossary 


Expected Value 
Expected arithmetic average when an experiment is repeated many
times. (Also called the mean). Notations: E(x), μ. For a discrete
random variable (RV) with probability distribution function P(x), the
definition can also be written in the form E(x) = μ = Σ xP(x).


Mean 
A number that measures the central tendency. A common name for
mean is ‘average.’ The term 'mean' is a shortened form of ‘arithmetic
mean.’ By definition, the mean for a sample (denoted by x̄) is
x̄ = (Sum of all values in the sample) / (Number of values in the sample),
and the mean for a population (denoted by μ) is
μ = (Sum of all values in the population) / (Number of values in the population).


Common Discrete Probability Distribution Functions 

This module serves as a lead-in for several types of common discrete 
probability distribution functions, including binomial, geometric, 
hypergeometric, and Poisson. 


Some of the more common discrete probability functions are binomial, 
geometric, hypergeometric, and Poisson. Most elementary courses do not 
cover the geometric, hypergeometric, and Poisson. Your instructor will let 
you know if he or she wishes to cover these distributions. 


A probability distribution function is a pattern. You try to fit a probability 
problem into a pattern or distribution in order to perform the necessary 
calculations. These distributions are tools to make solving probability 
problems easier. Each distribution has its own special characteristics. 
Learning the characteristics enables you to distinguish among the different 
distributions. 


Binomial 
This module describes the characteristics of a binomial experiment and the 
binomial probability distribution function. 


The characteristics of a binomial experiment are: 


1. There are a fixed number of trials. Think of trials as repetitions of an
experiment. The letter n denotes the number of trials.

2. There are only 2 possible outcomes, called "success" and "failure", for
each trial. The letter p denotes the probability of a success on one trial
and q denotes the probability of a failure on one trial. p + q = 1.

3. The n trials are independent and are repeated using identical
conditions. Because the n trials are independent, the outcome of one
trial does not help in predicting the outcome of another trial. Another
way of saying this is that for each individual trial, the probability, p, of
a success and probability, q, of a failure remain the same. For example,
randomly guessing at a true-false statistics question has only two
outcomes. If a success is guessing correctly, then a failure is guessing
incorrectly. Suppose Joe always guesses correctly on any statistics
true-false question with probability p = 0.6. Then, q = 0.4. This means
that for every true-false statistics question Joe answers, his
probability of success (p = 0.6) and his probability of failure (q = 0.4)
remain the same.


The outcomes of a binomial experiment fit a binomial probability 
distribution. The random variable X = the number of successes obtained 
in the n independent trials. 


The mean, μ, and variance, σ², for the binomial probability distribution are
μ = np and σ² = npq. The standard deviation, σ, is then σ = √(npq).
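The short-cut formulas can be checked against the general definitions of the mean and variance of a discrete distribution. In the sketch below, n = 20 and p = 0.55 are illustrative values chosen for the check, and binomial_pdf is a hypothetical helper name.

```python
from math import comb, isclose, sqrt


def binomial_pdf(n: int, p: float) -> dict:
    """Return {x: P(X = x)} for X ~ B(n, p), x = 0, 1, ..., n."""
    q = 1 - p
    return {x: comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)}


n, p = 20, 0.55  # illustrative values
pdf = binomial_pdf(n, p)

# General definitions: mean = sum of x P(x); variance = sum of (x - mean)^2 P(x)
mean = sum(x * px for x, px in pdf.items())
variance = sum((x - mean) ** 2 * px for x, px in pdf.items())

# Compare with the short-cut formulas np and npq
print(isclose(mean, n * p), isclose(variance, n * p * (1 - p)))
print(round(sqrt(variance), 4))  # standard deviation
```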


Any experiment that has characteristics 2 and 3 and where n = 1 is called a
Bernoulli Trial (named after Jacob Bernoulli who, in the late 1600s,
studied them extensively). A binomial experiment takes place when the
number of successes is counted in one or more Bernoulli Trials.


Example: 

At ABC College, the withdrawal rate from an elementary physics course is 
30% for any given term. This implies that, for any given term, 70% of the 
students stay in the class for the entire term. A "success" could be defined 
as an individual who withdrew. The random variable is X = the number of 
students who withdraw from the randomly selected elementary physics 
class. 


Example: 

Suppose you play a game that you can only either win or lose. The 
probability that you win any game is 55% and the probability that you lose 
is 45%. Each game you play is independent. If you play the game 20 times, 
what is the probability that you win 15 of the 20 games? Here, if you 
define X = the number of wins, then X takes on the values 0, 1, 2, 3, ..., 
20. The probability of a success is p = 0.55. The probability of a failure is 
q = 0.45. The number of trials is n = 20. The probability question can be
stated mathematically as P(x = 15).


Example: 

A fair coin is flipped 15 times. Each flip is independent. What is the 
probability of getting more than 10 heads? Let X = the number of heads in 
15 flips of the fair coin. X takes on the values 0, 1, 2, 3, ..., 15. Since the 
coin is fair, p = 0.5 and q = 0.5. The number of trials is n = 15. The 
probability question can be stated mathematically as P(x > 10).
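A probability such as P(x > 10) is the sum of the binomial probabilities for x = 11, 12, ..., 15. The sketch below computes it exactly with fractions, using the binomial formula C(n, x) p^x q^(n-x); the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

n = 15
p = Fraction(1, 2)  # fair coin
q = 1 - p

# "More than 10 heads" means x = 11, 12, 13, 14, or 15
p_more_than_10 = sum(comb(n, x) * p**x * q ** (n - x) for x in range(11, n + 1))

print(p_more_than_10, round(float(p_more_than_10), 4))
```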


Example: 

Approximately 70% of statistics students do their homework in time for it 
to be collected and graded. Each student does homework independently. In 
a Statistics class of 50 students, what is the probability that at least 40 will 
do their homework on time? Students are selected randomly. 

Exercise: 


Problem: 


This is a binomial problem because there is only a success or a
__________, there are a definite number of trials, and the probability
of a success is 0.70 for each trial.


Solution: 


failure 
Exercise: 


Problem: 


If we are interested in the number of students who do their homework, 
then how do we define X? 


Solution: 

X = the number of statistics students who do their homework on time
Exercise:

Problem: What values does x take on?


Solution:
x = 0, 1, 2, ..., 50
Exercise: 


Problem: What is a "failure", in words? 


Solution: 


Failure is a student who does not do his or her homework on time. 


The probability of a success is p = 0.70. The number of trials is n = 50.
Exercise: 


Problem: If p + q = 1, then what is q?
Solution:


q = 0.30
Exercise: 
Problem: 


The words "at least" translate as what kind of inequality for the
probability question P(x ____ 40)?


Solution: 


greater than or equal to (≥)


The probability question is P(x ≥ 40).


Notation for the Binomial: B = Binomial Probability 
Distribution Function 


X ~ B(n,p) 


Read this as "X is a random variable with a binomial distribution." The
parameters are n and p: n = number of trials, p = probability of a success on
each trial.


Example: 

It has been stated that about 41% of adult workers have a high school 
diploma but do not pursue any further education. If 20 adult workers are 
randomly selected, find the probability that at most 12 of them have a high 
school diploma but do not pursue any further education. How many adult
workers do you expect to have a high school diploma but do not pursue
any further education?

Let X = the number of workers who have a high school diploma but do not 
pursue any further education. 

X takes on the values 0, 1, 2, ..., 20 where n = 20 and p = 0.41. q = 1 −
0.41 = 0.59. X ~ B(20, 0.41)

Find P(x ≤ 12). P(x ≤ 12) = 0.9738. (calculator or computer)

Using the TI-83+ or the TI-84 calculators, the calculations are as follows. 
Go into 2nd DISTR. The syntax for the instructions is as follows.

To calculate P(x = value): binompdf(n, p, number). If "number" is left out,
the result is the binomial probability table.

To calculate P(x ≤ value): binomcdf(n, p, number). If "number" is left
out, the result is the cumulative binomial probability table.

For this problem: After you are in 2nd DISTR, arrow down to
binomcdf. Press ENTER. Enter 20,.41,12). The result is
P(x ≤ 12) = 0.9738.


Note: If you want to find P(x = 12), use the pdf (binompdf). If you want
to find P(x > 12), use 1 - binomcdf(20,.41,12).
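For readers working on a computer instead of a TI calculator, the same three quantities can be sketched in Python. The function names below are borrowed from the calculator commands purely for illustration; they are not a Python library API, and the code is an assumption-labeled sketch using only the standard library.

```python
from math import comb

def binompdf(n, p, x):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomcdf(n, p, x):
    """P(X <= x) for X ~ B(n, p), summing the pdf from 0 through x."""
    return sum(binompdf(n, p, k) for k in range(x + 1))

at_most_12 = binomcdf(20, 0.41, 12)        # P(x <= 12)
exactly_12 = binompdf(20, 0.41, 12)        # P(x = 12)
more_than_12 = 1 - binomcdf(20, 0.41, 12)  # P(x > 12)
print(round(at_most_12, 4))
```

The "more than" case is computed as a complement because the cumulative function only answers "at most" questions.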


The probability that at most 12 workers have a high school diploma but do not
pursue any further education is 0.9738.
The graph of X ~ B(20, 0.41) is:

[Graph of X ~ B(20, 0.41); the x-axis runs from 0 to 20.]

The y-axis contains the probability of x, where X = the number of workers
who have only a high school diploma.

The number of adult workers that you expect to have a high school
diploma but not pursue any further education is the mean,

μ = np = (20)(0.41) = 8.2.

The formula for the variance is σ² = npq. The standard deviation is

σ = √(npq) = √((20)(0.41)(0.59)) = 2.20.
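The shortcut formulas for the binomial mean and standard deviation agree with the general definitions of expected value and variance for a discrete random variable. The sketch below is an illustrative check under that reading, not part of the original text; all variable names are assumptions.

```python
from math import comb, sqrt

n, p = 20, 0.41
q = 1 - p

# Probability of each value x = 0, 1, ..., n
pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

# Mean and variance computed from their definitions
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

# Compare with the shortcut formulas np and sqrt(npq)
print(round(mean, 2), round(n * p, 2))
print(round(sqrt(variance), 2), round(sqrt(n * p * q), 2))
```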
Example:

The following example illustrates a problem that is not binomial. It 
violates the condition of independence. ABC College has a student 
advisory committee made up of 10 staff members and 6 students. The 
committee wishes to choose a chairperson and a recorder. What is the 
probability that the chairperson and recorder are both students? All names 
of the committee are put into a box and two names are drawn without 
replacement. The first name drawn determines the chairperson and the 
second name the recorder. There are two trials. However, the trials are not 
independent because the outcome of the first trial affects the outcome of 
the second trial. The probability of a student on the first draw is 6/16. The
probability of a student on the second draw is 5/15 when the first draw
produces a student. The probability is 6/15 when the first draw produces a
staff member. The probability of drawing a student's name changes for
each of the trials and, therefore, violates the condition of independence. 
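The changing probabilities in this example can be written out with exact fractions. The sketch below is an illustration only; it assumes the multiplication rule for dependent events to answer the question posed (both officers are students), and the variable names are hypothetical.

```python
from fractions import Fraction

students, staff = 6, 10
total = students + staff

p_first_student = Fraction(students, total)                  # first draw
p_second_given_student = Fraction(students - 1, total - 1)   # after a student
p_second_given_staff = Fraction(students, total - 1)         # after a staff member

# Both chairperson and recorder are students (multiplication rule)
p_both_students = p_first_student * p_second_given_student
print(p_both_students)
```

Because the two conditional probabilities for the second draw differ, the trials are not independent and the binomial model does not apply.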


Glossary 


Bernoulli Trials 
An experiment with the following characteristics: 


• There are only 2 possible outcomes called “success” and “failure”
for each trial.

• The probability p of a success is the same for any trial (so the
probability q = 1 − p of a failure is the same for any trial).


Binomial Distribution 
A discrete random variable (RV) which arises from Bernoulli trials. 
There are a fixed number, n, of independent trials. “Independent” 
means that the result of any trial (for example, trial 1) does not affect 
the results of the following trials, and all trials are conducted under the 
same conditions. Under these circumstances the binomial RV X is 
defined as the number of successes in n trials. The notation is: X ~
B(n, p). The mean is μ = np and the standard deviation is σ = √(npq).
The probability of exactly x successes in n trials is


P(X = x) = C(n, x) p^x q^(n − x), where C(n, x) = n! / (x! (n − x)!).


Geometric (optional) 

This module describes the geometric experiment and the geometric 
probability distribution. This module is included in the Collaborative 
Statistics textbook/collection as an optional lesson. 


The characteristics of a geometric experiment are: 


1. There are one or more Bernoulli trials with all failures except the last 
one, which is a success. In other words, you keep repeating what you 
are doing until the first success. Then you stop. For example, you 
throw a dart at a bull's eye until you hit the bull's eye. The first time 
you hit the bull's eye is a "success" so you stop throwing the dart. It 
might take you 6 tries until you hit the bull's eye. You can think of the 
trials as failure, failure, failure, failure, failure, success. STOP. 

2. In theory, the number of trials could go on forever. There must be at
least one trial.

3. The probability, p, of a success and the probability, q, of a failure are the
same for each trial. p + q = 1 and q = 1 − p. For example, the
probability of rolling a 3 when you throw one fair die is 1/6. This is true
no matter how many times you roll the die. Suppose you want to know
the probability of getting the first 3 on the fifth roll. On rolls 1, 2, 3,
and 4, you do not get a face with a 3. The probability for each of rolls
1, 2, 3, and 4 is q = 5/6, the probability of a failure. The probability of
getting a 3 on the fifth roll is (5/6)(5/6)(5/6)(5/6)(1/6) = 0.0804.
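The die calculation above is a product of four failures followed by one success. A minimal Python sketch of that pattern follows; it is an illustration rather than part of the original text, and the helper name `geometric_pmf` is an assumption.

```python
from fractions import Fraction

def geometric_pmf(p, x):
    """P(X = x): x - 1 failures followed by the first success on trial x."""
    return (1 - p) ** (x - 1) * p

p = Fraction(1, 6)                      # probability of rolling a 3
first_3_on_fifth = geometric_pmf(p, 5)  # four failures, then a success
print(first_3_on_fifth, round(float(first_3_on_fifth), 4))
```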


X = the number of independent trials until the first success. The mean and
variance are in the summary in this chapter.


Example: 
You play a game of chance that you can either win or lose (there are no
other possibilities) until you lose. Your probability of losing is p = 0.57.
What is the probability that it takes 5 games until you lose? Let X = the
number of games you play until you lose (includes the losing game). Then
X takes on the values 1, 2, 3, ... (could go on indefinitely). The probability
question is P(x = 5).


Example: 

A safety engineer feels that 35% of all industrial accidents in her plant are 
caused by failure of employees to follow instructions. She decides to look 
at the accident reports (selected randomly and replaced in the pile after 
reading) until she finds one that shows an accident caused by failure of 
employees to follow instructions. On the average, how many reports would 
the safety engineer expect to look at until she finds a report showing an 
accident caused by employee failure to follow instructions? What is the 
probability that the safety engineer will have to examine at least 3 reports 
until she finds a report showing an accident caused by employee failure to 
follow instructions? 

Let X = the number of accident reports the safety engineer must examine until
she finds a report showing an accident caused by employee failure to
follow instructions. X takes on the values 1, 2, 3, .... The first question
asks you to find the expected value or the mean. The second question asks
you to find P(x ≥ 3). ("At least" translates as a "greater than or equal to"
symbol.)
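Both questions in this example can be answered with short calculations. The sketch below is illustrative and assumes the geometric mean formula 1/p from the chapter summary; the variable names are hypothetical.

```python
p = 0.35   # probability a report shows failure to follow instructions
q = 1 - p

expected_reports = 1 / p   # mean of a geometric distribution

# "At least 3 reports" means the first two reports are both failures.
prob_at_least_3 = q ** 2

# The same answer by the complement: 1 - P(x = 1) - P(x = 2)
prob_complement = 1 - (p + q * p)

print(round(expected_reports, 2), round(prob_at_least_3, 4))
```

The two routes to the second answer agree, which is a useful check when translating "at least" into an inequality.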


Example: 

Suppose that you are looking for a student at your college who lives within 
five miles of you. You know that 55% of the 25,000 students do live within 
five miles of you. You randomly contact students from the college until 
one says he/she lives within five miles of you. What is the probability that 
you need to contact four people? 

This is a geometric problem because you may have a number of failures 
before you have the one success you desire. Also, the probability of a 
success stays the same each time you ask a student if he/she lives within 
five miles of you. There is no definite number of trials (number of times 
you ask a student). 

Exercise: 


Problem: 


Let X = the number of __________ you must ask __________
one says yes.


Solution: 


Let X = the number of students you must ask until one says yes. 


Exercise: 


Problem: What values does X take on? 


Solution: 


1, 2, 3, ..., (total number of students) 


Exercise: 


Problem: What are p and q? 


Solution: 
• p = 0.55
• q = 0.45
Exercise:
Problem: The probability question is P(_______).
Solution:
P(x = 4)


Notation for the Geometric: G = Geometric Probability 
Distribution Function 


X ~ G(p)


Read this as "X is a random variable with a geometric distribution." The
parameter is p. p = the probability of a success for each trial. 


Example: 

Assume that the probability of a defective computer component is 0.02. 
Components are randomly selected. Find the probability that the first 
defect is caused by the 7th component tested. How many components do 
you expect to test until one is found to be defective? 

Let X = the number of computer components tested until the first defect is 
found. 

X takes on the values 1, 2, 3, ... where p = 0.02. X ~ G(0.02) 

Find P(x = 7). P(x = 7) = 0.0177. (calculator or computer)

TI-83+ and TI-84: For a general discussion, see this example (binomial). 
The syntax is similar. The geometric parameter list is (p, number). If
"number" is left out, the result is the geometric probability table. For this 
problem: After you are in 2nd DISTR, arrow down to D:geometpdf. 
Press ENTER. Enter .02,7). The result is P(x = 7) = 0.0177. 

The probability that the 7th component is the first defect is 0.0177. 
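The same result can be reproduced off the calculator. The sketch below names its functions after the calculator commands for familiarity only; they are hypothetical helpers written here, not a Python library API.

```python
def geometpdf(p, x):
    """P(X = x): the first success occurs on trial x."""
    return (1 - p) ** (x - 1) * p

def geometcdf(p, x):
    """P(X <= x): the first success occurs on or before trial x."""
    return 1 - (1 - p) ** x

first_defect_is_7th = geometpdf(0.02, 7)
print(round(first_defect_is_7th, 4))
```

The cumulative version uses the fact that "no success in the first x trials" has probability (1 − p)^x.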

The graph of X ~ G(0.02) is: 


[Graph of P(X = x) against x for X ~ G(0.02).]

The y-axis contains the probability of x, where X = the number of
computer components tested.
The number of components that you would expect to test until you find the
first defective one is the mean, μ = 50.

The formula for the mean is μ = 1/p = 1/0.02 = 50.

The formula for the variance is

σ² = (1/p)(1/p − 1) = (1/0.02)(1/0.02 − 1) = 2450.

The standard deviation is

σ = √((1/p)(1/p − 1)) = √((1/0.02)(1/0.02 − 1)) = 49.5.


Glossary 


Geometric Distribution 
A discrete random variable (RV) which arises from the Bernoulli trials. 
The trials are repeated until the first success. The geometric variable X 
is defined as the number of trials until the first success. Notation: X ~ 
G(p). The mean is μ = 1/p and the standard deviation is
σ = √((1/p)(1/p − 1)). The probability that the first success occurs on
trial x (that is, after exactly x − 1 failures) is given by the formula:
P(X = x) = p(1 − p)^(x − 1).


Hypergeometric (optional) 

This module describes the properties of a hypergeometric experiment and 
hypergeometric probability distribution. This module is included in the 
Collaborative Statistics textbook/collection as an optional lesson. 


The characteristics of a hypergeometric experiment are: 


1. You take samples from 2 groups. 

2. You are concerned with a group of interest, called the first group. 

3. You sample without replacement from the combined groups. For 
example, you want to choose a softball team from a combined group of 
11 men and 13 women. The team consists of 10 players. 

4. Each pick is not independent, since sampling is without replacement. 
In the softball example, the probability of picking a woman first is 13/24.
The probability of picking a man second is 11/23 if a woman was picked
first. It is 10/23 if a man was picked first. The probability of the second
pick depends on what happened in the first pick.
5. You are not dealing with Bernoulli Trials. 


The outcomes of a hypergeometric experiment fit a hypergeometric 
probability distribution. The random variable X = the number of items 
from the group of interest. The mean and variance are given in the 
summary. 


Example: 

A candy dish contains 100 jelly beans and 80 gumdrops. Fifty candies are 
picked at random. What is the probability that 35 of the 50 are gumdrops? 
The two groups are jelly beans and gumdrops. Since the probability 
question asks for the probability of picking gumdrops, the group of interest 
(first group) is gumdrops. The size of the group of interest (first group) is 
80. The size of the second group is 100. The size of the sample is 50 (jelly 
beans or gumdrops). Let X = the number of gumdrops in the sample of 50. 
X takes on the values x = 0, 1, 2, ..., 50. The probability question is 

P(x = 35).


Example: 

Suppose a shipment of 100 VCRs is known to have 10 defective VCRs. An 
inspector randomly chooses 12 for inspection. He is interested in 
determining the probability that, among the 12, at most 2 are defective. 
The two groups are the 90 non-defective VCRs and the 10 defective VCRs. 
The group of interest (first group) is the defective group because the 
probability question asks for the probability of at most 2 defective VCRs. 
The size of the sample is 12 VCRs. (They may be non-defective or 
defective.) Let X = the number of defective VCRs in the sample of 12. X 
takes on the values 0, 1, 2, ..., 10. X may not take on the values 11 or 12. 
The sample size is 12, but there are only 10 defective VCRs. The inspector 
wants to know P(x ≤ 2) ("At most" means "less than or equal to").
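The inspector's question can be sketched with counting. The code below is an illustration, not part of the original text: it assumes the standard hypergeometric counting formula C(r, x)·C(b, n − x) / C(r + b, n), which this module does not state explicitly, and the helper name `hypergeom_pmf` is hypothetical.

```python
from math import comb

def hypergeom_pmf(r, b, n, x):
    """P(X = x): x items from the group of interest (size r) in a sample of n."""
    return comb(r, x) * comb(b, n - x) / comb(r + b, n)

r, b, n = 10, 90, 12   # 10 defective, 90 non-defective, sample of 12

# "At most 2" means x = 0, 1, or 2.
prob_at_most_2 = sum(hypergeom_pmf(r, b, n, x) for x in range(3))
print(round(prob_at_most_2, 4))
```

Summing the same function over x = 0, 1, ..., 10 should give 1, since those are all the values X can take here.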


Example: 

You are president of an on-campus special events organization. You need a 
committee of 7 to plan a special birthday party for the president of the 
college. Your organization consists of 18 women and 15 men. You are 
interested in the number of men on your committee. If the members of the 
committee are randomly selected, what is the probability that your 
committee has more than 4 men? 

This is a hypergeometric problem because you are choosing your 
committee from two groups (men and women). 

Exercise: 


Problem: Are you choosing with or without replacement? 


Solution: 


Without 


Exercise: 


Problem: What is the group of interest? 


Solution: 


The men 

Exercise: 
Problem: How many are in the group of interest? 
Solution: 
15 men 

Exercise: 
Problem: How many are in the other group? 
Solution: 


18 women 
Exercise: 


Problem: 

Let X = __________ on the committee. What values does X take on?

Solution: 

Let X = the number of men on the committee. x = 0, 1, 2, ..., 7. 
Exercise: 

Problem: The probability question is P(___).


Solution:


P(x > 4)


Notation for the Hypergeometric: H = Hypergeometric 
Probability Distribution Function 


X~H(r, b, n) 


Read this as "X is a random variable with a hypergeometric distribution."
The parameters are r, b, and n. r = the size of the group of interest (first
group), b = the size of the second group, n = the size of the chosen sample.


Example: 

A school site committee is to be chosen randomly from 6 men and 5 
women. If the committee consists of 4 members chosen randomly, what is 
the probability that 2 of them are men? How many men do you expect to 
be on the committee? 

Let X = the number of men on the committee of 4. The men are the group 
of interest (first group). 

X takes on the values 0, 1, 2, 3, 4, where r = 6, b = 5, and n = 4.
X ~ H(6, 5, 4)

Find P(x = 2). P(x = 2) = 0.4545 (calculator or computer) 


Note: Currently, the TI-83+ and TI-84 do not have hypergeometric
probability functions. There are a number of computer packages, 
including Microsoft Excel, that do. 


The probability that there are 2 men on the committee is about 0.45. 
The graph of X ~ H(6, 5, 4) is:

[Graph of P(X = x) against x for X ~ H(6, 5, 4).]

The y-axis contains the probability of X, where X = the number of men on
the committee.
You would expect μ = 2.18 (about 2) men on the committee.

The formula for the mean is μ = nr/(r + b) = (4)(6)/(6 + 5) = 2.18.
The formula for the variance is fairly complex. You will find it in the
Summary of the Discrete Probability Functions Chapter.
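Since the TI calculators lack a hypergeometric function, the committee example can be checked with a few lines of Python. This is a sketch under the assumption that the standard counting formula applies; `hypergeom_pmf` is a hypothetical helper, not a library function.

```python
from math import comb

def hypergeom_pmf(r, b, n, x):
    """P(X = x) for X ~ H(r, b, n)."""
    return comb(r, x) * comb(b, n - x) / comb(r + b, n)

r, b, n = 6, 5, 4   # 6 men (group of interest), 5 women, committee of 4

prob_2_men = hypergeom_pmf(r, b, n, 2)

# Mean from the definition (sum of x * P(X = x)) and from the formula nr/(r + b)
mean_by_sum = sum(x * hypergeom_pmf(r, b, n, x) for x in range(n + 1))
mean_by_formula = n * r / (r + b)

print(round(prob_2_men, 4), round(mean_by_formula, 2))
```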


Glossary 


Hypergeometric Distribution 
A discrete random variable (RV) that is characterized by 


• A fixed number of trials.
• The probability of success is not the same from trial to trial.


We sample from two groups of items when we are interested in only
one group. X is defined as the number of successes out of the total
number of items chosen. Notation: X ~ H(r, b, n), where r = the
number of items in the group of interest, b = the number of items in the
group not of interest, and n = the number of items chosen.


Poisson 

This module describes the characteristics of a Poisson experiment and the 
Poisson probability distribution. This module is included in the Elementary 
Statistics textbook/collection as an optional lesson. 


Characteristics of a Poisson experiment: 


1. The Poisson gives the probability of a number of events occurring in a 
fixed interval of time or space if these events happen with a known 
average rate and independently of the time since the last event. For 
example, a book editor might be interested in the number of words 
spelled incorrectly in a particular book. It might be that, on the 
average, there are 5 words spelled incorrectly in 100 pages. The 
interval is the 100 pages. 

2. The Poisson may be used to approximate the binomial if the 
probability of success is "small" (such as 0.01) and the number of trials 
is "large" (such as 1000). You will verify the relationship in the 
homework exercises. n is the number of trials and p is the probability 
of a "success." 


The outcomes of a Poisson experiment fit a Poisson probability
distribution. The random variable X = the number of
occurrences in the interval of interest. The mean and variance are given in 
the summary. 


Example: 

The average number of loaves of bread put on a shelf in a bakery in a half- 
hour period is 12. Of interest is the number of loaves of bread put on the 
shelf in 5 minutes. The time interval of interest is 5 minutes. What is the 
probability that the number of loaves, selected randomly, put on the shelf 
in 5 minutes is 3? 

Let X = the number of loaves of bread put on the shelf in 5 minutes. If the 
average number of loaves put on the shelf in 30 minutes (half-hour) is 12, 
then the average number of loaves put on the shelf in 5 minutes is 
(5/30)(12) = 2 loaves of bread.


The probability question asks you to find P(x = 3). 
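The rescaling of the average and the resulting probability can be sketched as follows. The code is illustrative only; it relies on the Poisson formula P(X = x) = e^(−μ) μ^x / x! from the glossary of this module, and the names used are assumptions.

```python
from math import exp, factorial

def poisson_pmf(mu, x):
    """P(X = x) for X ~ P(mu): e^(-mu) * mu^x / x!."""
    return exp(-mu) * mu**x / factorial(x)

loaves_per_30_min = 12
mu = loaves_per_30_min * 5 / 30   # average for a 5-minute interval
prob_3_loaves = poisson_pmf(mu, 3)
print(mu, round(prob_3_loaves, 4))
```

The key step is converting the given average to the interval of interest before using it as the Poisson mean.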


Example: 

A certain bank expects to receive 6 bad checks per day, on average. What 
is the probability of the bank getting fewer than 5 bad checks on any given 
day? Of interest is the number of checks the bank receives in 1 day, so the 
time interval of interest is 1 day. Let X = the number of bad checks the 
bank receives in one day. If the bank expects to receive 6 bad checks per 
day then the average is 6 checks per day. The probability question asks for 
P(x < 5).
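Because X counts whole checks, "fewer than 5" is the same as "at most 4." A short illustrative sketch of that translation, using the Poisson formula from the glossary (variable names are assumptions):

```python
from math import exp, factorial

mu = 6   # average number of bad checks per day

# "Fewer than 5" means x = 0, 1, 2, 3, or 4.
prob_fewer_than_5 = sum(exp(-mu) * mu**x / factorial(x) for x in range(5))
print(round(prob_fewer_than_5, 4))
```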


Example: 

You notice that a news reporter says "uh", on average, 2 times per 
broadcast. What is the probability that the news reporter says "uh" more 
than 2 times per broadcast?

This is a Poisson problem because you are interested in knowing the 
number of times the news reporter says "uh" during a broadcast. 
Exercise: 


Problem: What is the interval of interest? 
Solution: 


One broadcast 
Exercise: 


Problem: 


What is the average number of times the news reporter says "uh" 
during one broadcast? 


Solution: 


2 


Exercise: 


Problem: Let X = __________. What values does X take on?
Solution:


Let X = the number of times the news reporter says "uh" during
one broadcast.
x = 0, 1, 2, 3, ...


Exercise: 


Problem: The probability question is P(____). 


Solution: 


P(x > 2)


Notation for the Poisson: P = Poisson Probability Distribution 
Function 


X ~ P(μ)


Read this as "X is a random variable with a Poisson distribution." The
parameter is μ (or λ). μ (or λ) = the mean for the interval of interest.


Example: 

Leah's answering machine receives about 6 telephone calls between 8 a.m.
and 10 a.m. What is the probability that Leah receives more than 1 call in
the next 15 minutes?

Let X = the number of calls Leah receives in 15 minutes. (The interval of
interest is 15 minutes or 1/4 hour.)

x = 0, 1, 2, 3, ...


If Leah receives, on the average, 6 telephone calls in 2 hours, and there are
eight 15-minute intervals in 2 hours, then Leah receives

(1/8)(6) = 0.75

calls in 15 minutes, on the average. So, μ = 0.75 for this problem.

X ~ P(0.75)

Find P(x > 1). P(x > 1) = 0.1734 (calculator or computer)

TI-83+ and TI-84: For a general discussion, see this example (Binomial). 
The syntax is similar. The Poisson parameter list is (μ for the interval of
interest, number). For this problem:

Press 1- and then press 2nd DISTR. Arrow down to C:poissoncdf.
Press ENTER. Enter .75,1). The result is P(x > 1) = 0.1734. NOTE:
The TI calculators use λ (lambda) for the mean.

The probability that Leah receives more than 1 telephone call in the next 
fifteen minutes is about 0.1734. 
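The calculator steps correspond to a complement: P(x > 1) = 1 − P(x ≤ 1). The sketch below mirrors that in Python; the function names imitate the calculator commands for familiarity and are hypothetical helpers, not a library API.

```python
from math import exp, factorial

def poissonpdf(mu, x):
    """P(X = x) for X ~ P(mu)."""
    return exp(-mu) * mu**x / factorial(x)

def poissoncdf(mu, x):
    """P(X <= x) for X ~ P(mu)."""
    return sum(poissonpdf(mu, k) for k in range(x + 1))

mu = 6 / 8                                # 6 calls in 2 hours, eight intervals
prob_more_than_1 = 1 - poissoncdf(mu, 1)  # P(x > 1) = 1 - P(x <= 1)
print(round(prob_more_than_1, 4))
```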

The graph of X ~ P(0.75) is: 


[Graph of P(X = x) against x for X ~ P(0.75).]


The y-axis contains the probability of x where X = the number of calls in
15 minutes.


Glossary 


Poisson Distribution 
A discrete random variable (RV) that counts the number of times a 
certain event will occur in a specific interval. Characteristics of the 
variable: 


• The probability that the event occurs in a given interval is the
same for all intervals.

• The events occur with a known mean and independently of the
time since the last event.


The distribution is defined by the mean μ of the event in the interval.
Notation: X ~ P(μ). The mean is μ (with μ = np when the Poisson is used to
approximate a binomial). The standard deviation is σ = √μ. The probability
of having exactly x occurrences in the interval is
P(X = x) = e^(−μ) μ^x / x!, for x = 0, 1, 2, 3, .... The Poisson
distribution is often used to approximate the binomial distribution when
n is “large” and p is “small” (a general rule is that n should be greater
than or equal to 20 and p should be less than or equal to .05).
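How close the approximation is can be seen by computing both distributions side by side. The sketch below uses n = 1000 and p = 0.01, the illustrative "large" and "small" values mentioned earlier in this module; the function names and the range of x values examined are assumptions made for the illustration.

```python
from math import comb, exp, factorial

n, p = 1000, 0.01   # "large" n and "small" p
mu = n * p          # Poisson mean used for the approximation

def binomial_pmf(x):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x):
    return exp(-mu) * mu**x / factorial(x)

# Largest gap between the two probabilities over x = 0, 1, ..., 30
max_gap = max(abs(binomial_pmf(x) - poisson_pmf(x)) for x in range(31))
print(max_gap)
```

A small `max_gap` indicates that the two sets of probabilities nearly coincide for these parameter values.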


Summary of Functions 

This module provides a review of the binomial, geometric, hypergeometric, 
and Poisson probability distribution functions and their properties. 
Formula 

Binomial 


X ~ B(n, p)

X = the number of successes in n independent trials
n = the number of independent trials

X takes on the values x = 0, 1, 2, 3, ..., n

p = the probability of a success for any trial

q = the probability of a failure for any trial
p + q = 1, q = 1 − p


The mean is μ = np. The standard deviation is σ = √(npq).


Formula 
Geometric 


X~G(p) 


X = the number of independent trials until the first success (count the 
failures and the first success) 


X takes on the values x = 1, 2, 3, ...

p = the probability of a success for any trial
q = the probability of a failure for any trial
p + q = 1


q = 1 − p


The mean is μ = 1/p.


The standard deviation is σ = √((1/p)(1/p − 1)).


Formula 
Hypergeometric 


X~H(r,b,n) 


X = the number of items from the group of interest that are in the chosen 
sample. 


X may take on the values x= 0, 1, ..., up to the size of the group of interest. 
(The minimum value for X may be larger than 0 in some instances.) 


r = the size of the group of interest (first group)
b = the size of the second group
n = the size of the chosen sample


n ≤ r + b


The mean is: μ = nr / (r + b)


The standard deviation is: σ = √( rbn(r + b − n) / ((r + b)²(r + b − 1)) )

Formula 

Poisson 

X ~ P(μ)


X = the number of occurrences in the interval of interest
X takes on the values x = 0, 1, 2, 3, ...


The mean μ is typically given. (λ is often used as the mean instead of μ.)
When the Poisson is used to approximate the binomial, we use the binomial
mean μ = np. n is the binomial number of trials. p = the probability of a
success for each trial. This formula is valid when n is "large" and p "small"
(a general rule is that n should be greater than or equal to 20 and p should
be less than or equal to 0.05). If n is large enough and p is small enough
then the Poisson approximates the binomial very well. The variance is
σ² = μ and the standard deviation is σ = √μ.
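The mean and standard deviation formulas in this summary can be collected into small helper functions. The sketch below is an illustration under the assumption that the formulas are as stated in this summary; the function names are hypothetical, and the example parameters are taken from worked examples earlier in the chapter.

```python
from math import sqrt

def binomial_mean_sd(n, p):
    return n * p, sqrt(n * p * (1 - p))

def geometric_mean_sd(p):
    return 1 / p, sqrt((1 / p) * (1 / p - 1))

def hypergeometric_mean_sd(r, b, n):
    mean = n * r / (r + b)
    sd = sqrt(r * b * n * (r + b - n) / ((r + b) ** 2 * (r + b - 1)))
    return mean, sd

def poisson_mean_sd(mu):
    return mu, sqrt(mu)

print(binomial_mean_sd(20, 0.41))        # workers example
print(geometric_mean_sd(0.02))           # defective component example
print(hypergeometric_mean_sd(6, 5, 4))   # school site committee example
print(poisson_mean_sd(0.75))             # Leah's telephone calls example
```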


Practice 1: Discrete Distribution 

This module provides students an opportunity to practice applying concepts 
related to discrete distributions. This practice exercise asks students to 
calculate several values based on the data provided. 


Student Learning Outcomes 


e The student will analyze the properties of a discrete distribution. 


Given: 


A ballet instructor is interested in knowing what percent of each year's class 
will continue on to the next, so that she can plan what classes to offer. Over 
the years, she has established the following probability distribution. 


• Let X = the number of years a student will study ballet with the
teacher.
• Let P(x) = the probability that a student will study ballet x years.


Organize the Data 


Complete the table below using the data provided. 


x    P(x)    x*P(x)
1    0.10
2    0.05
3    0.10
4
5    0.30
6    0.20
7    0.10
Exercise: 


Problem: In words, define the Random Variable X. 


Exercise: 


Problem: P(x = 4) = 


Exercise: 


Problem: P(x < 4) = 
Exercise: 


Problem: 
On average, how many years would you expect a child to study ballet 
with this teacher? 

Discussion Question 

Exercise: 


Problem: What does the column "P(x)" sum to and why? 


Exercise: 


Problem: What does the column "x*P(x)" sum to and why?


Practice 2: Binomial Distribution 

This module provides a practice of Binomial Distribution as a part of 
Collaborative Statistics collection (col10522) by Barbara Illowsky and 
Susan Dean. 


Student Learning Outcomes 


e The student will construct the Binomial Distribution. 


Given 


The Higher Education Research Institute at UCLA collected data from 
203,967 incoming first-time, full-time freshmen from 270 four-year 
colleges and universities in the U.S. 71.3% of those students replied that, 
yes, they believe that same-sex couples should have the right to legal 
marital status. (Source:
http://heri.ucla.edu/PDFs/pubs/TFS/Norms/Monographs/TheAmericanFreshman2011.pdf)


Suppose that you randomly pick 8 first-time, full-time freshmen from the
survey. You are interested in the number who believe that same-sex couples
should have the right to legal marital status.


Interpret the Data 


Exercise: 


Problem: In words, define the random Variable X. 


Solution: 


X = the number that reply “yes”


Exercise: 


Problem: X~ 


Solution: 
B(8,0.713) 


Exercise: 


Problem: What values does the random variable X take on? 


Solution: 
0, 1, 2, 3, 4, 5, 6, 7, 8


Exercise: 


Problem: Construct the probability distribution function (PDF). 


Exercise: 


Problem: 


On average (μ), how many would you expect to answer yes?


Solution:


5.7


Exercise: 


Problem: What is the standard deviation (σ)?


Solution: 


1.28 
Exercise: 


Problem: 


What is the probability that at most 5 of the freshmen reply “yes”? 


Solution: 


0.4151 
Exercise: 


Problem: 
What is the probability that at least 2 of the freshmen reply “yes”? 
Solution: 


0.9990 
Exercise: 


Problem: 


Construct a histogram or plot a line graph. Label the horizontal and 
vertical axes with words. Include numerical scaling. 


Practice 3: Poisson Distribution 
This module provides further practices and exercises on Poisson 
Distribution in statistics. 


Student Learning Outcomes 


e The student will analyze the properties of a Poisson distribution. 


Given 


On average, eight teens in the U.S. die from motor vehicle injuries per day. 
As a result, states across the country are debating raising the driving age. 
(Source:
http://www.cdc.gov/Motorvehiclesafety/Teen_Drivers/teendrivers_factsheet.html)


Interpret the Data 


Exercise: 


Problem: 


Assume the event occurs independently in any given day. In words, 
define the Random Variable X.


Exercise: 


Problem: X ~


Solution: 


P(8) 


Exercise: 


Problem: What values does X take on?


Solution:


0, 1, 2, 3, 4, ...
Exercise: 


Problem: 


For the given values of the random variable X, fill in the
corresponding probabilities. 


Exercise: 


Problem: 


Is it likely that there will be no teens killed in the U.S. from motor 
vehicle injuries on any given day? Justify your answer numerically. 


Solution: 


No 
Exercise: 


Problem: 


Is it likely that there will be more than 20 teens killed in the U.S. from 
motor vehicle injuries on any given day? Justify your answer 
numerically. 


Solution: 


No 
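The two "justify numerically" questions above can be supported with the Poisson formula from the glossary. The sketch below is one possible justification, not the textbook's own solution; the names are assumptions, and it takes μ = 8 from the "Given" statement.

```python
from math import exp, factorial

mu = 8   # average number of deaths per day

def poisson_pmf(x):
    return exp(-mu) * mu**x / factorial(x)

prob_none = poisson_pmf(0)                                      # P(x = 0)
prob_more_than_20 = 1 - sum(poisson_pmf(x) for x in range(21))  # P(x > 20)
print(prob_none, prob_more_than_20)
```

Both probabilities come out far below 1%, which is consistent with the answer "No" to each question.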


Practice 4: Geometric Distribution 
This module provides further practice with topics of Geometric Distribution 
in Statistics. 


Student Learning Outcomes 


e The student will analyze the properties of a geometric distribution. 


Given: 
Use the information from the Binomial Distribution Practice shown below. 


The Higher Education Research Institute at UCLA collected data from 
203,967 incoming first-time, full-time freshmen from 270 four-year 
colleges and universities in the U.S. 71.3% of those students replied that, 
yes, they believe that same-sex couples should have the right to legal 
marital status. (Source:
http://heri.ucla.edu/PDFs/pubs/TFS/Norms/Monographs/TheAmericanFreshman2011.pdf)


Suppose that you randomly select freshmen from the study until you find
one who replies “yes.” You are interested in the number of freshmen you
must ask.


Interpret the Data 


Exercise: 


Problem: In words, define the Random Variable X. 


Exercise: 


Problem: X ~ 


Solution: 


G(0.713) 


Exercise: 


Problem: What values does the random variable X take on? 


Solution: 


1, 2, 3, ...
Exercise: 


Problem: 


Construct the probability distribution function (PDF). Stop at x 


Exercise: 


Problem: 


On average (μ), how many freshmen would you expect to have to ask
until you found one who replies "yes?" 


Solution: 


1.4
Exercise: 


Problem: 


What is the probability that you will need to ask fewer than 3 
freshmen? 


Solution: 


0.9176 
Exercise: 


Problem: 


Construct a histogram or plot a line graph. Label the horizontal and 
vertical axes with words. Include numerical scaling. 


Practice 5: Hypergeometric Distribution 
This module provides further practice and exercises on Hypergeometric 
Distribution in statistics 


Student Learning Outcomes 


e The student will analyze the properties of a hypergeometric 
distribution. 


Given 
Suppose that a group of statistics students is divided into two groups:
business majors and non-business majors. There are 16 business majors in
the group and 7 non-business majors in the group. A random sample of 9
students is taken. We are interested in the number of business majors in the
sample.
Interpret the Data 


Exercise: 


Problem: In words, define the Random Variable X. 


Exercise: 


Problem: X ~ 
Solution: 


H(16,7,9) 


Exercise: 


Problem: What values does X take on? 


Solution: 


2,3,4,5,6,7,8,9 
Exercise: 


Problem: 


Construct the probability distribution function (PDF) for X. 


Exercise: 


Problem: 
On average (μ), how many would you expect to be business majors?
Solution: 


6.26 
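The smallest value of X in this practice is 2, not 0, because only 7 non-business majors are available to fill a sample of 9. The sketch below illustrates that reasoning and the mean; it assumes the standard hypergeometric counting formula, and the variable names are hypothetical.

```python
from math import comb

r, b, n = 16, 7, 9   # 16 business majors, 7 non-business majors, sample of 9

# X cannot be smaller than n - b (only 7 non-business majors exist)
# and cannot be larger than the smaller of n and r.
values = list(range(max(0, n - b), min(n, r) + 1))

pmf = {x: comb(r, x) * comb(b, n - x) / comb(r + b, n) for x in values}
mean = n * r / (r + b)
print(values, round(mean, 2))
```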


Homework 

Discrete Random Variables: Homework is part of the collection col10555
written by Barbara Illowsky and Susan Dean and provides a
number of homework exercises related to Discrete Random Variables
(binomial, geometric, hypergeometric and Poisson) with contributions from
Roberta Bloom.

Exercise: 


Problem: 1. Complete the PDF and answer the questions. 


x    P(X = x)    x·P(X = x)
0    0.3
1    0.2
2
3    0.4


a. Find the probability that x = 2.
b. Find the expected value.


Solution: 


a. 0.1
b. 1.6


Exercise: 


Problem: 


Suppose that you are offered the following “deal.” You roll a die. If 
you roll a 6, you win $10. If you roll a 4 or 5, you win $5. If you roll a 
1, 2, or 3, you pay $6. 


a. What are you ultimately interested in here (the value of the roll
or the money you win)?
b. In words, define the Random Variable X.
c. List the values that X may take on.
d. Construct a PDF.
e. Over the long run of playing this game, what are your expected
average winnings per game?
f. Based on numerical values, should you take the deal? Explain
your decision in complete sentences.


Exercise: 


Problem: 


A venture capitalist, willing to invest $1,000,000, has three 
investments to choose from. The first investment, a software company, 
has a 10% chance of returning $5,000,000 profit, a 30% chance of 
returning $1,000,000 profit, and a 60% chance of losing the million 
dollars. The second company, a hardware company, has a 20% chance 
of returning $3,000,000 profit, a 40% chance of returning $1,000,000 
profit, and a 40% chance of losing the million dollars. The third 
company, a biotech firm, has a 10% chance of returning $6,000,000 
profit, a 70% chance of no profit or loss, and a 20% chance of losing the
million dollars. 


a. Construct a PDF for each investment.
b. Find the expected value for each investment.
c. Which is the safest investment? Why do you think so?
d. Which is the riskiest investment? Why do you think so?
e. Which investment has the highest expected return, on average?


Solution: 


b. $200,000; $600,000; $400,000
c. third investment
d. first investment
e. second investment
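
The expected values in part b can be checked with a short calculation. The sketch below is only an illustration: the helper name `expected_value` and the dictionary layout are assumptions made here, and each loss is entered as a negative profit of $1,000,000, which is how the problem statement is read.

```python
def expected_value(pdf):
    """Return the sum of x * P(X = x) for a list of (value, probability) pairs."""
    total_probability = sum(p for _, p in pdf)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in pdf)


# Profit in dollars and its probability, taken from the problem statement.
investments = {
    "software": [(5_000_000, 0.10), (1_000_000, 0.30), (-1_000_000, 0.60)],
    "hardware": [(3_000_000, 0.20), (1_000_000, 0.40), (-1_000_000, 0.40)],
    "biotech": [(6_000_000, 0.10), (0, 0.70), (-1_000_000, 0.20)],
}

expected_profits = {name: expected_value(pdf) for name, pdf in investments.items()}
for name, value in expected_profits.items():
    print(f"{name}: ${value:,.0f}")
```

The same helper applies to any of the PDF tables in this homework set once the table is written as (value, probability) pairs.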


Exercise: 


Problem: 


A theater group holds a fund-raiser. It sells 100 raffle tickets for $5 
apiece. Suppose you purchase 4 tickets. The prize is 2 passes to a 
Broadway show, worth a total of $150. 


a. What are you interested in here?
b. In words, define the Random Variable X.
c. List the values that X may take on.
d. Construct a PDF.
e. If this fund-raiser is repeated often and you always purchase 4
tickets, what would be your expected average winnings per raffle?


Exercise: 


Problem: 


Suppose that 20,000 married adults in the United States were randomly 
surveyed as to the number of children they have. The results are 
compiled and are used as theoretical probabilities. Let X = the number 
of children 


x              P(X = x)
0              0.10
1              0.20
2              0.30
3
4              0.10
5              0.05
6 (or more)    0.05


a. Find the probability that a married adult has 3 children.
b. In words, what does the expected value in this example
represent?
c. Find the expected value.
d. Is it more likely that a married adult will have 2-3 children or
4-6 children? How do you know?


Solution: 


a. 0.2
c. 2.35
d. 2-3 children


Exercise: 


Problem: 


Suppose that the PDF for the number of years it takes to earn a 
Bachelor of Science (B.S.) degree is given below. 


3 0.05 
4 0.40 
5 0.30
6 0.15 
7 0.10 


a. In words, define the Random Variable X.
b. What does it mean that the values 0, 1, and 2 are not included
for x in the PDF?
c. On average, how many years do you expect it to take for an
individual to earn a B.S.?


For each problem: 


a. In words, define the Random Variable X.
b. List the values that X may take on.
c. Give the distribution of X. X ~


Then, answer the questions specific to each individual problem. 
Exercise: 
Problem: 


Six different colored dice are rolled. Of interest is the number of dice 
that show a “1.” 


d. On average, how many dice would you expect to show a “1”?
e. Find the probability that all six dice show a “1.”
f. Is it more likely that 3 or that 4 dice will show a “1”? Use
numbers to justify your answer.


Solution: 


a. X = the number of dice that show a 1
b. 0, 1, 2, 3, 4, 5, 6
c. X ~ B(6, 1/6)
d. 1
e. 0.00002
f. 3 dice
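
The binomial answers above can be reproduced directly from the formula P(X = k) = C(n, k) p^k (1 - p)^(n - k). This is a minimal sketch; the function name `binomial_pmf` is a label chosen for this illustration, not a name from the text.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 6, 1 / 6          # six dice, each shows a "1" with probability 1/6
mean = n * p             # part d: expected number of dice showing a "1"
p_all_six = binomial_pmf(6, n, p)   # part e
p_three = binomial_pmf(3, n, p)     # part f, first candidate
p_four = binomial_pmf(4, n, p)      # part f, second candidate

print(f"mean = {mean:.2f}")
print(f"P(X = 6) = {p_all_six:.5f}")
print(f"P(X = 3) = {p_three:.4f}, P(X = 4) = {p_four:.4f}")
```

Comparing `p_three` with `p_four` gives the numerical justification that part f asks for.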


Exercise: 


Problem: 


More than 96 percent of the very largest colleges and universities 
(more than 15,000 total enrollments) have some online offerings. 
Suppose you randomly pick 13 such institutions. We are interested in 
the number that offer distance learning courses. (Source: 
http://en.wikipedia.org/wiki/Distance_education) 


d. On average, how many schools would you expect to offer such
courses?
e. Find the probability that at most 6 offer such courses.
f. Is it more likely that 0 or that 13 will offer such courses? Use
numbers to justify your answer and answer in a complete sentence.


Exercise: 


Problem: 


A school newspaper reporter decides to randomly survey 12 students 
to see if they will attend Tet (Vietnamese New Year) festivities this 
year. Based on past years, she knows that 18% of students attend Tet 
festivities. We are interested in the number of students who will attend 
the festivities. 


d. How many of the 12 students do we expect to attend the
festivities?
e. Find the probability that at most 4 students will attend.
f. Find the probability that more than 2 students will attend.


Solution: 


a. X = the number of students that will attend Tet.
b. 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
c. X ~ B(12, 0.18)
d. 2.16
e. 0.9511
f. 0.3702


Exercise: 
Problem: 


Suppose that about 85% of graduating students attend their graduation. 
A group of 22 graduating students is randomly chosen. 


d. How many are expected to attend their graduation?
e. Find the probability that 17 or 18 attend.
f. Based on numerical values, would you be surprised if all 22
attended graduation? Justify your answer numerically.


Exercise: 


Problem: 


At The Fencing Center, 60% of the fencers use the foil as their main 
weapon. We randomly survey 25 fencers at The Fencing Center. We 
are interested in the numbers that do not use the foil as their main 
weapon. 


d. How many are expected to not use the foil as their main
weapon?
e. Find the probability that six do not use the foil as their main
weapon.
f. Based on numerical values, would you be surprised if all 25 did
not use foil as their main weapon? Justify your answer
numerically.


Solution: 


a. X = the number of fencers that do not use foil as their main
weapon
b. 0, 1, 2, 3, ..., 25
c. X ~ B(25, 0.40)
d. 10
e. 0.0442
f. Yes


Exercise: 


Problem: 


Approximately 8% of students at a local high school participate in 
after-school sports all four years of high school. A group of 60 seniors 
is randomly chosen. Of interest is the number that participated in after- 
school sports all four years of high school. 


d. How many seniors are expected to have participated in after-
school sports all four years of high school?
e. Based on numerical values, would you be surprised if none of
the seniors participated in after-school sports all four years of
high school? Justify your answer numerically.
f. Based upon numerical values, is it more likely that 4 or that 5 of
the seniors participated in after-school sports all four years of
high school? Justify your answer numerically.


Exercise: 


Problem: 


The chance of having an extra fortune in a fortune cookie is about 3%. 
Given a bag of 144 fortune cookies, we are interested in the number of 
cookies with an extra fortune. Two distributions may be used to solve 
this problem. Use one distribution to solve the problem. 


d. How many cookies do we expect to have an extra fortune?
e. Find the probability that none of the cookies have an extra
fortune.
f. Find the probability that more than 3 have an extra fortune.
g. As n increases, what happens involving the probabilities using
the two distributions? Explain in complete sentences.


Solution: 


a. X = the number of fortune cookies that have an extra fortune
b. 0, 1, 2, 3, ..., 144
c. X ~ B(144, 0.03) or P(4.32)
d. 4.32
e. 0.0124 or 0.0133
f. 0.6300 or 0.6264
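
The paired answers in parts e and f come from the two distributions named in part c. A small sketch, assuming the Poisson mean is taken as np = 4.32 (the helper names are chosen here for illustration), shows both side by side:

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, mu):
    """Poisson probability of exactly k events when the mean is mu."""
    return exp(-mu) * mu**k / factorial(k)


n, p = 144, 0.03
mu = n * p  # 4.32, used as the Poisson mean

binom_none = binomial_pmf(0, n, p)
poisson_none = poisson_pmf(0, mu)

# "More than 3" is the complement of "0, 1, 2, or 3".
binom_more_than_3 = 1 - sum(binomial_pmf(k, n, p) for k in range(4))
poisson_more_than_3 = 1 - sum(poisson_pmf(k, mu) for k in range(4))

print(f"P(X = 0): binomial {binom_none:.4f}, Poisson {poisson_none:.4f}")
print(f"P(X > 3): binomial {binom_more_than_3:.4f}, Poisson {poisson_more_than_3:.4f}")
```

Rerunning the sketch with a larger n and a proportionally smaller p (so that np stays at 4.32) is one way to explore part g.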


Exercise: 


Problem: 


There are two games played for Chinese New Year and Vietnamese 
New Year. They are almost identical. In the Chinese version, fair dice 
with numbers 1, 2, 3, 4, 5, and 6 are used, along with a board with 
those numbers. In the Vietnamese version, fair dice with pictures of a 
gourd, fish, rooster, crab, crayfish, and deer are used. The board has 
those six objects on it, also. We will play with bets being $1. The 
player places a bet on a number or object. The “house” rolls three dice. 
If none of the dice show the number or object that was bet, the house 
keeps the $1 bet. If one of the dice shows the number or object bet 
(and the other two do not show it), the player gets back his $1 bet, plus 
$1 profit. If two of the dice show the number or object bet (and the 
third die does not show it), the player gets back his $1 bet, plus $2 
profit. If all three dice show the number or object bet, the player gets 
back his $1 bet, plus $3 profit. 


Let X = number of matches and Y= profit per game. 


d. List the values that Y may take on. Then, construct one PDF
table that includes both X & Y and their probabilities.
e. Calculate the average expected matches over the long run of
playing this game for the player.
f. Calculate the average expected earnings over the long run of
playing this game for the player.
g. Determine who has the advantage, the player or the house.


Exercise: 


Problem: 


According to the South Carolina Department of Mental Health web 
site, for every 200 U.S. women, the average number who suffer from 
anorexia is one (http://www.state.sc.us/dmh/anorexia/statistics.htm).
Out of a randomly chosen group of 600 U.S. women: 


d. How many are expected to suffer from anorexia?
e. Find the probability that no one suffers from anorexia.
f. Find the probability that more than four suffer from anorexia.


Solution: 


a. X = the number of women that suffer from anorexia
b. 0, 1, 2, 3, ..., 600 (can leave off 600)
c. X ~ P(3)
d. 3
e. 0.0498
f. 0.1847


Exercise: 


Problem: 


The average number of children a Japanese woman has in her lifetime 
is 1.37. Suppose that one Japanese woman is randomly chosen. 
(http://www.mhlw.go.jp/english/policy/children/children- 
childrearing/index.html MHLW’s Pamphlet) 


d. Find the probability that she has no children.
e. Find the probability that she has fewer children than the
Japanese average.
f. Find the probability that she has more children than the Japanese
average.


Exercise: 


Problem: 


The average number of children a Spanish woman has in her lifetime 
is 1.47. Suppose that one Spanish woman is randomly chosen. 
(http://www.typicallyspanish.com/news/publish/article 4897.shtml). 


d. Find the probability that she has no children.
e. Find the probability that she has fewer children than the Spanish
average.
f. Find the probability that she has more children than the Spanish
average.


Solution: 


a. X = the number of children for a Spanish woman
b. 0, 1, 2, 3, ...
c. X ~ P(1.47)
d. 0.2299
e. 0.5679
f. 0.4321


Exercise: 


Problem: 


Fertile (female) cats produce an average of 3 litters per year. (Source: 
The Humane Society of the United States). Suppose that one fertile, 
female cat is randomly chosen. In one year, find the probability she 
produces: 


d. No litters.
e. At least 2 litters.
f. Exactly 3 litters.


Exercise: 


Problem: 


A consumer looking to buy a used red Miata car will call dealerships 
until she finds a dealership that carries the car. She estimates the 
probability that any independent dealership will have the car will be 
28%. We are interested in the number of dealerships she must call. 


d. On average, how many dealerships would we expect her to have
to call until she finds one that has the car?
e. Find the probability that she must call at most 4 dealerships.
f. Find the probability that she must call 3 or 4 dealerships.


Solution: 


a. X = the number of dealers she calls until she finds one with a
used red Miata
b. 1, 2, 3, ...
c. X ~ G(0.28)
d. 3.57
e. 0.7313
f. 0.2497
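
For the geometric distribution, P(X = k) = (1 - p)^(k - 1) p, P(X ≤ k) = 1 - (1 - p)^k, and the mean is 1/p. The sketch below applies these formulas to the dealership problem; the function names are illustrative choices, not names from the text.

```python
def geometric_pmf(k, p):
    """Probability that the first success occurs on trial k (k = 1, 2, 3, ...)."""
    return (1 - p) ** (k - 1) * p


def geometric_cdf(k, p):
    """Probability that the first success occurs on or before trial k."""
    return 1 - (1 - p) ** k


p = 0.28                      # chance that any one dealership has the car
mean_calls = 1 / p            # part d
p_at_most_4 = geometric_cdf(4, p)                       # part e
p_3_or_4 = geometric_pmf(3, p) + geometric_pmf(4, p)    # part f

print(f"expected calls = {mean_calls:.2f}")
print(f"P(X <= 4) = {p_at_most_4:.4f}")
print(f"P(X = 3 or 4) = {p_3_or_4:.4f}")
```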


Exercise: 


Problem: 


Suppose that the probability that an adult in America will watch the 
Super Bowl is 40%. Each person is considered independent. We are 
interested in the number of adults in America we must survey until we 
find one who will watch the Super Bowl. 


d. How many adults in America do you expect to survey until you
find one who will watch the Super Bowl?
e. Find the probability that you must ask 7 people.
f. Find the probability that you must ask 3 or 4 people.


Exercise: 


Problem: 


A group of Martial Arts students is planning on participating in an 
upcoming demonstration. 6 are students of Tae Kwon Do; 7 are 
students of Shotokan Karate. Suppose that 8 students are randomly 
picked to be in the first demonstration. We are interested in the number 
of Shotokan Karate students in that first demonstration. Hint: Use the 
Hypergeometric distribution. Look in the Formulas section of 4: 
Discrete Distributions and in the Appendix Formulas. 


d. How many Shotokan Karate students do we expect to be in that
first demonstration?
e. Find the probability that 4 students of Shotokan Karate are
picked for the first demonstration.
f. Suppose that we are interested in the Tae Kwon Do students that
are picked for the first demonstration. Find the probability that all
6 students of Tae Kwon Do are picked for the first demonstration.


Solution: 


d. 4.31
e. 0.4079
f. 0.0163
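
These hypergeometric answers follow from counting combinations: the number of ways to pick k students from the group of interest, times the number of ways to pick the rest from the other group, divided by the number of ways to pick the whole sample. A minimal sketch (function and variable names are assumptions for this illustration):

```python
from math import comb


def hypergeometric_pmf(k, group_of_interest, other_group, sample_size):
    """Probability that exactly k of the sampled items come from the group of interest."""
    total = group_of_interest + other_group
    ways = comb(group_of_interest, k) * comb(other_group, sample_size - k)
    return ways / comb(total, sample_size)


karate, tae_kwon_do, picked = 7, 6, 8

mean_karate = picked * karate / (karate + tae_kwon_do)               # part d
p_four_karate = hypergeometric_pmf(4, karate, tae_kwon_do, picked)   # part e
# Part f switches the group of interest to the Tae Kwon Do students.
p_all_tae_kwon_do = hypergeometric_pmf(6, tae_kwon_do, karate, picked)

print(f"expected Shotokan Karate students = {mean_karate:.2f}")
print(f"P(4 Shotokan Karate students) = {p_four_karate:.4f}")
print(f"P(all 6 Tae Kwon Do students) = {p_all_tae_kwon_do:.4f}")
```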


Exercise: 


Problem: 


The chance of an IRS audit for a tax return with over $25,000 in income
is about 2% per year. We are interested in the expected number of 
audits a person with that income has in a 20 year period. Assume each 
year is independent. 


d. How many audits are expected in a 20 year period?
e. Find the probability that a person is not audited at all.
f. Find the probability that a person is audited more than twice.


Exercise: 


Problem: 


Refer to the previous problem. Suppose that 100 people with tax 
returns over $25,000 are randomly picked. We are interested in the 
number of people audited in 1 year. One way to solve this problem is 
by using the Binomial Distribution. Since n is large and p is small, 
another discrete distribution could be used to solve the following 
problems. Solve the following questions (d-f) using that distribution. 


d. How many are expected to be audited?
e. Find the probability that no one was audited.
f. Find the probability that more than 2 were audited.


Solution: 


d. 2
e. 0.1353
f. 0.3233


Exercise: 


Problem: 


Suppose that a technology task force is being formed to study 
technology awareness among instructors. Assume that 10 people will 
be randomly chosen to be on the committee from a group of 28 
volunteers, 20 who are technically proficient and 8 who are not. We 
are interested in the number on the committee who are not technically 
proficient. 


d. How many instructors do you expect on the committee who are
not technically proficient?
e. Find the probability that at least 5 on the committee are not
technically proficient.
f. Find the probability that at most 3 on the committee are not
technically proficient.


Exercise: 
Problem: 


Refer back to Exercise 4.15.12. Solve this problem again, using a 
different, though still acceptable, distribution. 


Solution: 


a. X = the number of seniors that participated in after-school
sports all 4 years of high school
b. 0, 1, 2, 3, ..., 60
c. X ~ P(4.8)
d. 4.8
e. Yes
f. 4


Exercise: 


Problem: 


Suppose that 9 Massachusetts athletes are scheduled to appear at a 
charity benefit. The 9 are randomly chosen from 8 volunteers from the 
Boston Celtics and 4 volunteers from the New England Patriots. We 
are interested in the number of Patriots picked. 


d. Is it more likely that there will be 2 Patriots or 3 Patriots picked?


Exercise: 


Problem: 


On average, Pierre, an amateur chef, drops 3 pieces of egg shell into 
every 2 batters of cake he makes. Suppose that you buy one of his 
cakes. 


d. On average, how many pieces of egg shell do you expect to be
in the cake?
e. What is the probability that there will not be any pieces of egg
shell in the cake?
f. Let’s say that you buy one of Pierre’s cakes each week for 6
weeks. What is the probability that there will not be any egg shell
in any of the cakes?
g. Based upon the average given for Pierre, is it possible for there
to be 7 pieces of shell in the cake? Why?


Solution: 


a. X = the number of shell pieces in one cake
b. 0, 1, 2, 3, ...
c. X ~ P(1.5)
d. 1.5
e. 0.2231
f. 0.0001
g. Yes


Exercise: 


Problem: 


It has been estimated that only about 30% of California residents have 
adequate earthquake supplies. Suppose we are interested in the number 
of California residents we must survey until we find a resident who 
does not have adequate earthquake supplies. 


d. What is the probability that we must survey just 1 or 2 residents
until we find a California resident who does not have adequate
earthquake supplies?
e. What is the probability that we must survey at least 3 California
residents until we find a California resident who does not have
adequate earthquake supplies?
f. How many California residents do you expect to need to survey
until you find a California resident who does not have adequate
earthquake supplies?
g. How many California residents do you expect to need to survey
until you find a California resident who does have adequate
earthquake supplies?


Exercise: 


Problem: 


Refer to the above problem. Suppose you randomly survey 11 
California residents. We are interested in the number who have 
adequate earthquake supplies. 


d. What is the probability that at least 8 have adequate earthquake
supplies?
e. Is it more likely that none or that all of the residents surveyed
will have adequate earthquake supplies? Why?
f. How many residents do you expect will have adequate
earthquake supplies?


Solution: 


d. 0.0043
e. none
f. 3.3


The next 2 questions refer to the following: In one of its Spring catalogs, 
L.L. Bean® advertised footwear on 29 of its 192 catalog pages. 
Exercise: 


Problem: 


Suppose we randomly survey 20 pages. We are interested in the 
number of pages that advertise footwear. Each page may be picked at 
most once. 


d. How many pages do you expect to advertise footwear on them?
e. Is it probable that all 20 will advertise footwear on them? Why
or why not?
f. What is the probability that less than 10 will advertise footwear
on them?


Exercise: 


Problem: 


Suppose we randomly survey 20 pages. We are interested in the 
number of pages that advertise footwear. This time, each page may be 
picked more than once. 


d. How many pages do you expect to advertise footwear on them?
e. Is it probable that all 20 will advertise footwear on them? Why
or why not?
f. What is the probability that less than 10 will advertise footwear
on them?
g. Reminder: A page may be picked more than once. We are
interested in the number of pages that we must randomly survey
until we find one that has footwear advertised on it. Define the
random variable X and give its distribution.
h. What is the probability that you only need to survey at most 3
pages in order to find one that advertises footwear on it?
i. How many pages do you expect to need to survey in order to find
one that advertises footwear?


Solution: 


d. 3.02
e. No
f. 0.9997
h. 0.3881
i. 6.6207 pages


Exercise: 


Problem: 


Suppose that you roll a fair die until each face has appeared at least 
once. It does not matter in what order the numbers appear. Find the 
expected number of rolls you must make until each face has appeared 
at least once. 
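
One possible approach, offered here as a sketch rather than as the intended solution, treats the wait for each new face as a geometric random variable. Once `seen` distinct faces have appeared, a roll shows a new face with probability (6 - seen)/6, so the expected wait for the next new face is 6/(6 - seen). Adding these waits gives the expected total number of rolls.

```python
from fractions import Fraction

faces = 6

# Expected wait for a new face when `seen` faces have already appeared
# is faces / (faces - seen); sum the waits for seen = 0, 1, ..., 5.
expected_rolls = sum(Fraction(faces, faces - seen) for seen in range(faces))

print(f"expected number of rolls = {expected_rolls} = {float(expected_rolls):.1f}")
```

Exact fractions are used so that no rounding enters the sum.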


Try these multiple choice problems. 


For the next three problems: The probability that the San Jose Sharks will 
win any given game is 0.3694 based on a 13 year win history of 382 wins 
out of 1034 games played (as of a certain date). An upcoming monthly 
schedule contains 12 games. 

Let X = the number of games won in that upcoming month. 

Exercise: 


Problem: The expected number of wins for that upcoming month is: 


A. 1.67
B. 12


C. 382/1034


D. 4.43


Solution: 


D: 4.43 
Exercise: 


Problem: 


What is the probability that the San Jose Sharks win 6 games in that 
upcoming month? 


A. 0.1476
B. 0.2336
C. 0.7664
D. 0.8903


Solution: 


A: 0.1476 


Exercise: 


Problem: 


What is the probability that the San Jose Sharks win at least 5 games in
that upcoming month?


A. 0.3694
B. 0.5266
C. 0.4734
D. 0.2305


Solution: 


C: 0.4734 


For the next two questions: The average number of times per week that 
Mrs. Plum’s cats wake her up at night because they want to play is 10. We 
are interested in the number of times her cats wake her up each week. 
Exercise: 


Problem: In words, the random variable X = 


A. The number of times Mrs. Plum’s cats wake her up each week
B. The number of times Mrs. Plum’s cats wake her up each hour
C. The number of times Mrs. Plum’s cats wake her up each night
D. The number of times Mrs. Plum’s cats wake her up


Solution: 


A: The number of times Mrs. Plum's cats wake her up each week 
Exercise: 
Problem: 


Find the probability that her cats will wake her up no more than 5 
times next week. 


A. 0.5000
B. 0.9329
C. 0.0378
D. 0.0671


Solution: 


D: 0.0671 
Exercise: 


Problem: 


People visiting video rental stores often rent more than one DVD at a 
time. The probability distribution for DVD rentals per customer at 
Video To Go is given below. There is a 5 video limit per customer at this
store, so nobody ever rents more than 5 DVDs. 


x         0     1     2     3     4     5
P(X=x)    0.03  0.50  0.24  ?     0.07  0.04


A. Describe the random variable X in words.
B. Find the probability that a customer rents three DVDs.
C. Find the probability that a customer rents at least 4 DVDs.
D. Find the probability that a customer rents at most 2 DVDs.


Another shop, Entertainment Headquarters, rents DVDs and 
videogames. The probability distribution for DVD rentals per customer 
at this shop is given below. They also have a 5 DVD limit per
customer. 


x         0     1     2     3     4     5
P(X=x)    0.35  0.25  0.20  0.10  0.05  0.05


E. At which store is the expected number of DVDs rented per
customer higher?
F. If Video to Go estimates that they will have 300 customers next
week, how many DVDs do they expect to rent next week?
Answer in sentence form.
G. If Video to Go expects 300 customers next week and
Entertainment HQ projects that they will have 420 customers, for
which store is the expected number of DVD rentals for next week
higher? Explain.
H. Which of the two video stores experiences more variation in
the number of DVD rentals per customer? How do you know
that?


Solution: 


Partial Answer: 

A: X = the number of DVDs a Video to Go customer rents 
B: 0.12 

C: 0.11 

D: 0.77 
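
Parts E through H compare the two stores by mean and by spread. The sketch below is one way to set up that comparison, assuming P(X = 3) = 0.12 for Video to Go as found in part B; the helper name `mean_and_sd` is chosen for this illustration.

```python
from math import sqrt


def mean_and_sd(pdf):
    """Return the mean and standard deviation of a discrete PDF given as {x: P(X = x)}."""
    mean = sum(x * p for x, p in pdf.items())
    variance = sum((x - mean) ** 2 * p for x, p in pdf.items())
    return mean, sqrt(variance)


video_to_go = {0: 0.03, 1: 0.50, 2: 0.24, 3: 0.12, 4: 0.07, 5: 0.04}
entertainment_hq = {0: 0.35, 1: 0.25, 2: 0.20, 3: 0.10, 4: 0.05, 5: 0.05}

vtg_mean, vtg_sd = mean_and_sd(video_to_go)
ehq_mean, ehq_sd = mean_and_sd(entertainment_hq)

print(f"Video to Go: mean {vtg_mean:.2f}, standard deviation {vtg_sd:.2f}")
print(f"Entertainment HQ: mean {ehq_mean:.2f}, standard deviation {ehq_sd:.2f}")
```

Multiplying each mean by the projected number of customers gives the weekly totals asked about in parts F and G.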


Exercise: 
Problem: 
A game involves selecting a card from a deck of cards and tossing a
coin. The deck has 52 cards and 12 cards are "face cards" (Jack,
Queen, or King). The coin is a fair coin and is equally likely to land on
Heads or Tails.


• If the card is a face card and the coin lands on Heads, you win $6.
• If the card is a face card and the coin lands on Tails, you win $2.
• If the card is not a face card, you lose $2, no matter what the coin
shows.


A. Find the expected value for this game (expected net gain or
loss).
B. Explain what your calculations indicate about your long-term
average profits and losses on this game.
C. Should you play this game to win money?


Solution: 


The variable of interest is X = net gain or loss, in dollars 


The face cards are J, Q, K (Jack, Queen, King). There are (3)(4) = 12 face
cards and 52 - 12 = 40 cards that are not face cards.


We first need to construct the probability distribution for X. We use the 
card and coin events to determine the probability for each outcome, 
but we use the monetary value of X to determine the expected value. 


Card Event                      $X net gain or loss    P(X)
Face Card and Heads             6                      (12/52)(1/2) = 6/52
Face Card and Tails             2                      (12/52)(1/2) = 6/52
(Not Face Card) and (H or T)    -2                     (40/52)(1) = 40/52


• Expected value = (6)(6/52) + (2)(6/52) + (-2)(40/52) = -32/52
• Expected value = -$0.62, rounded to the nearest cent
• If you play this game repeatedly, over a long number of games,
you would expect to lose 62 cents per game, on average.
• You should not play this game to win money because the
expected value indicates an expected average loss.


Exercise: 
Problem: 
You buy a lottery ticket to a lottery that costs $10 per ticket. There are
only 100 tickets available to be sold in this lottery. In this lottery there is
one $500 prize, 2 $100 prizes and 4 $25 prizes. Find your expected
gain or loss.


Solution: 


Start by writing the probability distribution. X is net gain or loss = 
prize (if any) less $10 cost of ticket 


X = $ net gain or loss    P(X)
$500 - $10 = $490         1/100
$100 - $10 = $90          2/100
$25 - $10 = $15           4/100
$0 - $10 = -$10           93/100


Expected Value = (490)(1/100) + (90)(2/100) + (15)(4/100) + (-10)(93/100)
= -$2. There is an expected loss of $2 per ticket, on average.


Exercise: 


Problem: 


A student takes a 10 question true-false quiz, but did not study and 
randomly guesses each answer. Find the probability that the student 
passes the quiz with a grade of at least 70% of the questions correct. 


Solution: 


• X = number of questions answered correctly
• X ~ B(10, 0.5)
• We are interested in AT LEAST 70% of 10 questions correct.
70% of 10 is 7. We want to find the probability that X is greater
than or equal to 7. The event "at least 7" is the complement of
"less than or equal to 6".
• Using your calculator's distribution menu: 1 - binomcdf(10, .5, 6)
gives 0.171875
• The probability of getting at least 70% of the 10 questions correct
when randomly guessing is approximately 0.172
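
For readers without a calculator that has a distribution menu, the complement calculation above can be sketched in Python. The function `binomcdf` below is a stand-in written for this illustration; it copies the calculator's argument order (n, p, x) but is not a library function.

```python
from math import comb


def binomcdf(n, p, x):
    """Cumulative binomial probability P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


# P(X >= 7) is the complement of P(X <= 6) for X ~ B(10, 0.5).
p_pass = 1 - binomcdf(10, 0.5, 6)

print(f"P(X >= 7) = {p_pass}")
```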


Exercise: 


Problem: 


A student takes a 32 question multiple choice exam, but did not study 
and randomly guesses each answer. Each question has 3 possible 
choices for the answer. Find the probability that the student guesses 
more than 75% of the questions correctly. 


Solution: 


• X = number of questions answered correctly
• X ~ B(32, 1/3)
• We are interested in MORE THAN 75% of 32 questions correct.
75% of 32 is 24. We want to find P(x > 24). The event "more than
24" is the complement of "less than or equal to 24".
• Using your calculator's distribution menu: 1 - binomcdf(32, 1/3, 24)
• P(x > 24) = 0.00000026761
• The probability of getting more than 75% of the 32 questions
correct when randomly guessing is very small and practically
zero.


Exercise: 


Problem: 


Suppose that you are performing the probability experiment of rolling
one fair six-sided die. Let F be the event of rolling a "4" or a "5". You
are interested in how many times you need to roll the die in order to
obtain the first “4 or 5” as the outcome.


• p = probability of success (event F occurs)
• q = probability of failure (event F does not occur)


A. Write the description of the random variable X. What are the
values that X can take on? Find the values of p and q.
B. Find the probability that the first occurrence of event F (rolling
a “4” or “5”) is on the second trial.
C. How many trials would you expect until you roll a “4” or “5”?


Solution: 


A: X can take on the values 1, 2, 3, .... p = 2/6, q = 4/6
B: 0.2222
C: 3


**Exercises 38 - 43 contributed by Roberta Bloom 


Review 
This module provides a number of homework/review exercises 
summarizing topics related to Discrete Random Variables. 


The next two questions refer to the following: 


A recent poll concerning credit cards found that 35 percent of respondents 
use a credit card that gives them a mile of air travel for every dollar they 
charge. Thirty percent of the respondents charge more than $2000 per 
month. Of those respondents who charge more than $2000, 80 percent use a 
credit card that gives them a mile of air travel for every dollar they charge. 
Exercise: 


Problem: 


What is the probability that a randomly selected respondent will spend 
more than $2000 AND use a credit card that gives them a mile of air 
travel for every dollar they charge? 


A. (0.30)(0.35)
B. (0.80)(0.35)
C. (0.80)(0.30)
D. (0.80)


Solution: 


C 
Exercise: 
Problem: 
Based upon the above information, are using a credit card that gives a
mile of air travel for each dollar spent AND charging more than $2000
per month independent events?


A. Yes
B. No, and they are not mutually exclusive either
C. No, but they are mutually exclusive
D. Not enough information given to determine the answer


Solution: 


B 
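
The reasoning behind the last two answers can be laid out numerically. In this sketch the variable names are chosen for illustration; the three probabilities come from the poll described above. Two events are independent when P(A AND B) = P(A)P(B), and mutually exclusive when P(A AND B) = 0.

```python
p_miles = 0.35                  # P(uses an air-miles card)
p_over_2000 = 0.30              # P(charges more than $2000 per month)
p_miles_given_over_2000 = 0.80  # P(air-miles card | charges more than $2000)

# Multiplication rule: P(A AND B) = P(A | B) * P(B)
p_both = p_miles_given_over_2000 * p_over_2000

independent = abs(p_both - p_miles * p_over_2000) < 1e-12
mutually_exclusive = p_both == 0

print(f"P(both) = {p_both:.2f}")
print(f"P(miles) * P(over $2000) = {p_miles * p_over_2000:.3f}")
print(f"independent: {independent}, mutually exclusive: {mutually_exclusive}")
```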
Exercise: 


Problem: 


A sociologist wants to know the opinions of employed adult women 
about government funding for day care. She obtains a list of 520 
members of a local business and professional women’s club and mails 
a questionnaire to 100 of these women selected at random. 68 
questionnaires are returned. What is the population in this study? 


A. All employed adult women
B. All the members of a local business and professional women’s
club
C. The 100 women who received the questionnaire
D. All employed women with children


Solution: 


A 


The next two questions refer to the following: An article from The San Jose 
Mercury News was concerned with the racial mix of the 1500 students at 
Prospect High School in Saratoga, CA. The table summarizes the results. 
(Male and female values are approximate.) Suppose one Prospect High 
School student is randomly selected. 


Ethnic Group

Gender    White    Asian    Hispanic    Black    American Indian
Male      400      168      115         35       16
Female    440      132      140         40       14


Exercise:



Problem: Find the probability that a student is Asian or Male. 


Solution: 


0.5773 
Exercise: 


Problem: 


Find the probability that a student is Black given that the student is 


Female. 


Solution: 


0.0522 
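
Both answers can be recomputed from the counts in the table. This is a sketch only; the nested dictionary is an assumed way of storing the table, with rows for gender and columns for ethnic group.

```python
counts = {
    "Male": {"White": 400, "Asian": 168, "Hispanic": 115, "Black": 35, "American Indian": 16},
    "Female": {"White": 440, "Asian": 132, "Hispanic": 140, "Black": 40, "American Indian": 14},
}

total = sum(sum(row.values()) for row in counts.values())
male = sum(counts["Male"].values())
female = sum(counts["Female"].values())
asian = sum(row["Asian"] for row in counts.values())
asian_and_male = counts["Male"]["Asian"]

# Addition rule: P(A OR B) = P(A) + P(B) - P(A AND B)
p_asian_or_male = (asian + male - asian_and_male) / total

# Conditional probability: restrict the sample space to female students.
p_black_given_female = counts["Female"]["Black"] / female

print(f"P(Asian OR Male) = {p_asian_or_male:.4f}")
print(f"P(Black | Female) = {p_black_given_female:.4f}")
```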
Exercise: 


Problem: 


A sample of pounds lost, in a certain month, by individual members of
a weight reducing clinic produced the following statistics:


• Mean = 5 lbs.
• Median = 4.5 lbs.
• Mode = 4 lbs.
• Standard deviation = 3.8 lbs.
• First quartile = 2 lbs.
• Third quartile = 8.5 lbs.


The correct statement is:


A. One fourth of the members lost exactly 2 pounds.
B. The middle fifty percent of the members lost from 2 to 8.5 lbs.
C. Most people lost 3.5 to 4.5 lbs.
D. All of the choices above are correct.


Solution: 


B 
Exercise: 
Problem: 


What does it mean when a data set has a standard deviation equal to
zero?


A. All values of the data appear with the same frequency.
B. The mean of the data is also zero.
C. All of the data have the same value.
D. There are no data to begin with.


Solution: 


C 


Exercise: 


Problem: The statement that best describes the illustration below is: 


A. The mean is equal to the median.
B. There is no first quartile.
C. The lowest data value is the median.


D. The median equals (Q1 + Q3)/2


Solution: 


C 

Exercise: 
Problem: 
According to a recent article (San Jose Mercury News) the average
number of babies born with significant hearing loss (deafness) is
approximately 2 per 1000 babies in a healthy baby nursery. The
number climbs to an average of 30 per 1000 babies in an intensive care
nursery.


Suppose that 1000 babies from healthy baby nurseries were randomly 
surveyed. Find the probability that exactly 2 babies were born deaf. 


Solution: 


0.2709 
Exercise: 
Problem: 
A “friend” offers you the following “deal.” For a $10 fee, you may 


pick an envelope from a box containing 100 seemingly identical 
envelopes. However, each envelope contains a coupon for a free gift. 


• 10 of the coupons are for a free gift worth $6.
• 80 of the coupons are for a free gift worth $8.
• 6 of the coupons are for a free gift worth $12.
• 4 of the coupons are for a free gift worth $40.


Based upon the financial gain or loss over the long run, should you 
play the game? 


• A. Yes, I expect to come out ahead in money.
• B. No, I expect to come out behind in money.
• C. It doesn’t matter. I expect to break even.


Solution: 


B 
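
One way to see why the answer is B is to compute the expected value of a single play. The sketch below uses only the coupon counts, gift values, and fee given in the problem; the variable names are illustrative.

```python
# (number of coupons, gift value in dollars), taken from the problem statement
coupons = [(10, 6), (80, 8), (6, 12), (4, 40)]
fee = 10  # cost of picking one envelope

total = sum(count for count, _ in coupons)  # number of envelopes in the box

# Expected gift value: each value weighted by its probability of being drawn
expected_gift = sum(count / total * value for count, value in coupons)

# Expected net gain per play: expected gift value minus the fee
expected_net = expected_gift - fee

print(f"expected gift value: {expected_gift:.2f}")
print(f"expected net gain per play: {expected_net:.2f}")
```

The expected gift value is below the $10 fee, so the expected net gain is negative and a player loses money in the long run.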


The next four questions refer to the following: Recently, a nurse 
commented that when a patient calls the medical advice line claiming to 
have the flu, the chance that he/she truly has the flu (and not just a nasty 
cold) is only about 4%. Of the next 25 patients calling in claiming to have 
the flu, we are interested in how many actually have the flu. 

Exercise: 


Problem: Define the Random Variable and list its possible values. 


Solution: 


X = the number of patients calling in claiming to have the flu, who 
actually have the flu. X = 0, 1, 2, ...25 


Exercise: 


Problem: State the distribution of X . 


Solution: 


B(25,0.04) 
Exercise: 


Problem: 


Find the probability that at least 4 of the 25 patients actually have the 
flu. 


Solution: 


0.0165 
Exercise: 
Problem: 


On average, for every 25 patients calling in, how many do you expect 
to have the flu? 


Solution: 


1
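
The answers to the flu questions can be verified with a short calculation. This is a minimal sketch, assuming X ~ B(25, 0.04) as stated in the solution above; `binomial_pmf` is a helper written for this example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 25, 0.04

# P(X >= 4) = 1 - P(X <= 3), using the complement to avoid summing 22 terms
p_at_least_4 = 1 - sum(binomial_pmf(k, n, p) for k in range(4))

# Expected number of callers who actually have the flu
expected_count = n * p

print(round(p_at_least_4, 4), expected_count)
```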


The next two questions refer to the following: Different types of writing can 
sometimes be distinguished by the number of letters in the words used. A 
student interested in this fact wants to study the number of letters of words 
used by Tom Clancy in his novels. She opens a Clancy novel at random and 
records the number of letters of the first 250 words on the page. 

Exercise: 


Problem: What kind of data was collected? 
• A. qualitative
• B. quantitative - continuous
• C. quantitative - discrete


Solution: 


C 


Exercise: 


Problem: What is the population under study? 
Solution: 


All words used by Tom Clancy in his novels 


Lab 1: Discrete Distribution (Playing Card Experiment) 

This module allows students to explore concepts related to discrete random 
variables through the use of a simple playing card experiment. Students will 
compare empirical data to a theoretical distribution to determine if the 
experiment fits a discrete distribution. This lab involves the concept of
long-term probabilities. 


Class Time: 


Names: 


Student Learning Outcomes: 


• The student will compare empirical data and a theoretical distribution
to determine if an everyday experiment fits a discrete distribution.

• The student will demonstrate an understanding of long-term
probabilities.


Supplies:


• One full deck of playing cards


Procedure 
The experiment procedure is to pick one card from a deck of shuffled cards. 


1. The theoretical probability of picking a diamond from a deck is:

2. Shuffle a deck of cards. 

3. Pick one card from it. 

4. Record whether it was a diamond or not a diamond. 

5. Put the card back and reshuffle. 

6. Do this a total of 10 times.

7. Record the number of diamonds picked. 

8. Let X = number of diamonds. Theoretically, X ~ B( ) 


Organize the Data 


1. Record the number of diamonds picked for your class in the chart 
below. Then calculate the relative frequency. 


x Frequency Relative Frequency 


2. Calculate the following: 


   a. x̄ =
   b. s =

3. Construct a histogram of the empirical data. 


[Blank histogram: relative frequency (vertical axis) versus number of diamonds (horizontal axis)]


Theoretical Distribution 


1. Build the theoretical PDF chart based on the distribution in the 
Procedure section above. 


10 


2. Calculate the following: 


3. Construct a histogram of the theoretical distribution. 


[Blank histogram: probability (vertical axis) versus number of diamonds (horizontal axis)]


Using the Data 


Calculate the following, rounding to 4 decimal places: 


Note:RF = relative frequency 


Use the table from the section titled "Theoretical Distribution" here: 


Use the data from the section titled "Organize the Data" here: 


• RF(x = 3) =
• RF(1 < x < 4) =
• RF(x > 8) =


Discussion Questions 


For questions 1. and 2., think about the shapes of the two graphs, the 
probabilities and the relative frequencies, the means, and the standard 
deviations. 


1. Knowing that data vary, describe three similarities between the graphs 
and distributions of the theoretical and empirical distributions. Use 
complete sentences. (Note: These answers may vary and still be 
correct.) 

2. Describe the three most significant differences between the graphs or 
distributions of the theoretical and empirical distributions. (Note: 
These answers may vary and still be correct.) 

3. Using your answers from the two previous questions, does it appear 
that the data fit the theoretical distribution? In 1 - 3 complete 
sentences, explain why or why not. 

4. Suppose that the experiment had been repeated 500 times. Which table 
(from "Organize the data" and "Theoretical Distributions") would you 
expect to change (and how would it change)? Why? Why wouldn’t the 
other table change? 


Lab 2: Discrete Distribution (Lucky Dice Experiment) 

This module allows students to apply concepts related to discrete 
distributions to a simple dice experiment. Students will compare empirical 
data and a theoretical distribution to determine if the game fits a discrete 
distribution. This experiment involves the concept of long-term 
probabilities. 


Class Time: 


Names: 


Student Learning Outcomes: 
• The student will compare empirical data and a theoretical distribution
to determine if a Tet gambling game fits a discrete distribution.

• The student will demonstrate an understanding of long-term
probabilities.


Supplies:


• 1 game “Lucky Dice” or 3 regular dice


Note:For a detailed game description, refer here. (The link goes to the 
beginning of Discrete Random Variables Homework. Please refer to 
Problem #14.) 


Note:Round relative frequencies and probabilities to four decimal places. 


The Procedure 


1. The experiment procedure is to bet on one object. Then, roll 3 Lucky 
Dice and count the number of matches. The number of matches will 
decide your profit. 

2. What is the theoretical probability of 1 die matching the object? 

3. Choose one object to place a bet on. Roll the 3 Lucky Dice. Count the 
number of matches. 

4. Let X = number of matches. Theoretically, X~B ( ) 

5. Let Y = profit per game. 


Organize the Data 
In the chart below, fill in the Y value that corresponds to each X value. 
Next, record the number of matches picked for your class. Then, calculate 


the relative frequency. 


1. Complete the table. 


x y Frequency Relative Frequency 


2. Calculate the Following: 


   a. x̄ =
   b. s_x =
   c. ȳ =
   d. s_y =


3. Explain what x represents. 
4. Explain what y represents. 
5. Based upon the experiment: 


   a. What was the average profit per game?
   b. Did this represent an average win or loss per game?
   c. How do you know? Answer in complete sentences.


6. Construct a histogram of the empirical data 


[Blank histogram: relative frequency (vertical axis) versus number of matches (horizontal axis)]


Theoretical Distribution 


1. Build the theoretical PDF chart for X and Y based on the distribution from
the section titled "The Procedure".


2. Calculate the following 


   a. μ_x =
   b. σ_x =
   c. μ_y =


3. Explain what μ_x represents.
4. Explain what μ_y represents.
5. Based upon theory: 


   a. What was the expected profit per game?
   b. Did the expected profit represent an average win or loss per
      game?
   c. How do you know? Answer in complete sentences.


6. Construct a histogram of the theoretical distribution. 


[Blank histogram: probability (vertical axis) versus number of matches (horizontal axis)]


Use the Data 


Calculate the following (rounded to 4 decimal places): 


Note:RF = relative frequency 


Use the data from the section titled "Theoretical Distribution" here: 


1. P(x = 3) =
2. P(0 < x < 3) =
3. P(x ≥ 2) =


Use the data from the section titled "Organize the Data" here: 


1. RF(x = 3) =
2. RF(0 < x < 3) =
3. RF(x ≥ 2) =


Discussion Question 


For questions 1. and 2., consider the graphs, the probabilities and relative 
frequencies, the means and the standard deviations. 


1. Knowing that data vary, describe three similarities between the graphs 
and distributions of the theoretical and empirical distributions. Use 
complete sentences. (Note: these answers may vary and still be 
correct.) 

2. Describe the three most significant differences between the graphs or 
distributions of the theoretical and empirical distributions. (Note: these 
answers may vary and still be correct.) 

3. Thinking about your answers to 1. and 2., does it appear that the data fit
the theoretical distribution? In 1 - 3 complete sentences, explain why 
or why not. 

4. Suppose that the experiment had been repeated 500 times. Which table 
(from "Organize the Data" or "Theoretical Distribution") would you 
expect to change? Why? How might the table change? 


Binomial Distribution 


When you flip a coin, there are two possible outcomes: heads and tails. 
Each outcome has a fixed probability, the same from trial to trial. In the 
case of coins, heads and tails each have the same probability of 1/2. More 
generally, there are situations in which the coin is biased, so that heads and 
tails have different probabilities. In the present section, we consider 
probability distributions for which there are just two possible outcomes 
with fixed probability summing to one. These distributions are called
binomial distributions.


A Simple Example 


The four possible outcomes that could occur if you flipped a coin twice are 
listed in [link]. Note that the four outcomes are equally likely: each has 
probability 1/4. To see this, note that the tosses of the coin are independent 
(neither affects the other). Hence, the probability of a head on Flip 1 and a
head on Flip 2 is the product of Pr[H] and Pr[H], which is
1/2 x 1/2 = 1/4. The same calculation applies to the probability of a head
on Flip 1 and a tail on Flip 2. Each is 1/2 x 1/2 = 1/4.


Outcome   First Flip   Second Flip
1         Heads        Heads
2         Heads        Tails
3         Tails        Heads
4         Tails        Tails


Four Possible Outcomes


The four possible outcomes can be classified in terms of the number of heads
that come up. The number could be two (Outcome 1), one (Outcomes 2 and
3) or 0 (Outcome 4). The probabilities of these possibilities are shown in
[link] and in [link]. Since two of the outcomes represent the case in which
just one head appears in the two tosses, the probability of this event is equal
to 1/4 + 1/4 = 1/2. [link] summarizes the situation.


Number of Heads Probability 
0 1/4 
1 1/2 
2 1/4 


Probabilities of Getting 0,1, or 2 heads. 


[Figure: bar chart of probability (vertical axis) versus number of heads (horizontal axis)]


Probabilities of 0, 1, and 2 heads.


[link] is a discrete probability distribution: It shows the probability for each 
of the values on the X-axis. Defining a head as a "success," [link] shows the 
probability of 0, 1, and 2 successes for two trials (flips) for an event that has 
a probability of 0.5 of being a success on each trial. This makes [link] an 
example of a binomial distribution. 
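
The two-flip distribution can also be obtained by listing every outcome and counting heads. The following sketch assumes a fair coin, as in the example; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two flips: HH, HT, TH, TT
outcomes = list(product("HT", repeat=2))

# Independence: each outcome has probability 1/2 x 1/2 = 1/4
p_outcome = Fraction(1, 2) * Fraction(1, 2)

# Group the outcomes by number of heads and add up their probabilities
distribution = {}
for outcome in outcomes:
    heads = outcome.count("H")
    distribution[heads] = distribution.get(heads, 0) + p_outcome

for heads in sorted(distribution):
    print(heads, distribution[heads])
```

Exact fractions are used so that the results match the table entries 1/4, 1/2, and 1/4 without rounding.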


The Formula for Binomial Probabilities 


The binomial distribution consists of the probabilities of each of the
possible numbers of successes on N trials for independent events that each
have a probability of π (the Greek letter pi) of occurring. For the coin flip
example, N = 2 and π = 0.5. The formula for the binomial distribution is
shown below:


$$\Pr[x] = \frac{N!}{x!(N-x)!}\,\pi^{x}(1-\pi)^{N-x}$$


where Pr[x] is the probability of x successes out of N trials, N is the
number of trials, and π is the probability of success on a given trial.
Applying this to the coin flip example,


$$\Pr[0] = \frac{2!}{0!(2-0)!}\,0.5^{0}(1-0.5)^{2-0} = \frac{2}{2}\times 1 \times 0.25 = 0.25$$

$$\Pr[1] = \frac{2!}{1!(2-1)!}\,0.5^{1}(1-0.5)^{2-1} = \frac{2}{1}\times 0.5 \times 0.5 = 0.50$$

$$\Pr[2] = \frac{2!}{2!(2-2)!}\,0.5^{2}(1-0.5)^{2-2} = \frac{2}{2}\times 0.25 \times 1 = 0.25$$


If you flip a coin twice, what is the probability of getting one or more 
heads? Since the probability of getting exactly one head is 0.50 and the 
probability of getting exactly two heads is 0.25, the probability of getting 
one or more heads is 0.50 + 0.25 = 0.75. 


Now suppose that the coin is biased. The probability of heads is only 0.4. 
What is the probability of getting heads at least once in two tosses? 


Substituting into our general formula above, you should obtain the answer
0.64.
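
The formula is easy to evaluate in code. The sketch below is one possible translation, with a hypothetical function name `binomial_probability`; it reproduces the fair-coin probabilities and the biased-coin answer.

```python
from math import factorial

def binomial_probability(x, n, pi):
    """Probability of exactly x successes in n independent trials,
    each with success probability pi."""
    coefficient = factorial(n) / (factorial(x) * factorial(n - x))
    return coefficient * pi**x * (1 - pi)**(n - x)

# Fair coin, two flips: probabilities of 0, 1, and 2 heads
fair = [binomial_probability(x, 2, 0.5) for x in range(3)]

# Biased coin (probability of heads 0.4): at least one head in two tosses
biased_at_least_one = (binomial_probability(1, 2, 0.4)
                       + binomial_probability(2, 2, 0.4))

print(fair)
print(round(biased_at_least_one, 2))
```

The biased-coin result can be cross-checked with the complement: the probability of no heads is 0.6 x 0.6 = 0.36, and 1 - 0.36 = 0.64.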


Cumulative Probabilities 


We toss a coin 12 times. What is the probability that we get from 0 to 3 
heads? The answer is found by computing the probability of exactly 0 
heads, exactly 1 head, exactly 2 heads, and exactly 3 heads. The probability 
of getting from 0 to 3 heads is then the sum of these probabilities. The 
probabilities are: 0.0002, 0.0029, 0.0161, and 0.0537. The sum of the 
probabilities is 0.073. The calculation of cumulative binomial probabilities 
can be quite tedious. Therefore we have provided a binomial calculator to 
make it easy to calculate these probabilities. 
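
A cumulative probability is just a sum of individual binomial probabilities, so it can be computed with a short loop. This is a minimal sketch, assuming a fair coin (probability of heads 0.5); the helper names are made up for this example.

```python
from math import comb

def binomial_pmf(x, n, pi):
    """Probability of exactly x successes in n trials."""
    return comb(n, x) * pi**x * (1 - pi)**(n - x)

def binomial_cumulative(x_max, n, pi):
    """Probability of 0 through x_max successes in n trials."""
    return sum(binomial_pmf(x, n, pi) for x in range(x_max + 1))

# Twelve tosses of a fair coin: probabilities of exactly 0, 1, 2, 3 heads
terms = [binomial_pmf(x, 12, 0.5) for x in range(4)]
cumulative = binomial_cumulative(3, 12, 0.5)

print([round(t, 4) for t in terms])
print(round(cumulative, 3))
```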


Note: Click here for the binomial calculator. 


Mean and Standard Deviation of Binomial Distributions 


Consider a coin-tossing experiment in which you tossed a coin 12 times and 
recorded the number of heads. If you performed this experiment over and 
over again, what would the mean number of heads be? On average, you 
would expect half the coin tosses to come up heads. Therefore the mean 
number of heads would be 6. In general, the mean of a binomial distribution
with parameters N (the number of trials) and π (the probability of success
for each trial) is:


m = Nπ


where m is the mean of the binomial distribution. The variance of the
binomial distribution is:


s² = Nπ(1 - π)


where s² is the variance of the binomial distribution.


Let's return to the coin tossing experiment. The coin was tossed 12 times so
N = 12. A coin has a probability of 0.5 of coming up heads. Therefore,
π = 0.5. The mean and variance can therefore be computed as
follows:


m = Nπ = 12 x 0.5 = 6
s² = Nπ(1 - π) = 12 x 0.5 x (1.0 - 0.5) = 3.0


Naturally, the standard deviation (s) is the square root of the variance (s²),
which here is approximately 1.73.
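
The mean and variance formulas can be checked against the distribution itself by weighting each possible number of heads by its probability. The sketch below does this for the 12-toss example; the names are illustrative.

```python
from math import comb, sqrt

def binomial_pmf(x, n, pi):
    """Probability of exactly x successes in n trials."""
    return comb(n, x) * pi**x * (1 - pi)**(n - x)

n, pi = 12, 0.5
pmf = [binomial_pmf(x, n, pi) for x in range(n + 1)]

# Mean and variance computed directly from the probability distribution
mean_direct = sum(x * p for x, p in enumerate(pmf))
variance_direct = sum((x - mean_direct) ** 2 * p for x, p in enumerate(pmf))

# The same quantities from the shortcut formulas
mean_formula = n * pi
variance_formula = n * pi * (1 - pi)
sd_formula = sqrt(variance_formula)

print(mean_direct, mean_formula)
print(variance_direct, variance_formula)
print(round(sd_formula, 2))
```

Both routes should agree up to floating-point rounding, which illustrates that the shortcut formulas summarize the full distribution.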




Glossary 


binomial distributions 
A probability distribution for independent events for which there
are only two possible outcomes such as a coin flip. If one of the two
outcomes is defined as a success, then the probability of exactly x
successes out of N trials (events) is given by:


$$\Pr[x] = \frac{N!}{x!(N-x)!}\,\pi^{x}(1-\pi)^{N-x}$$


where π is the probability of success on one trial.


conditional probability 
The probability that event A occurs given that event B has already 
occurred is called the conditional probability of A given B. 
Symbolically, this is written as Pr[A | B]. The probability it rains on
Monday given that it rained on Sunday would be written as
Pr[Rain on Monday | Rain on Sunday].


continuous variables 


Variables that can take on any value in a certain range. Time and 
distance are continuous; gender, SAT score and "time rounded to the 
nearest second" are not. Variables that are not continuous are known 
as discrete variables. No measured variable is truly continuous;
however, discrete variables measured with enough precision can often 
be considered continuous for practical purposes. 


discrete 
Variables that can only take on a finite number of values are called 
"discrete variables." All qualitative variables are discrete. Some 
quantitative variables are discrete, such as performance rated as 
1,2,3,4, or 5, or temperature rounded to the nearest degree. Sometimes, 
a variable that takes on enough discrete values can be considered to be 
continuous for practical purposes. One example is time to the nearest 
millisecond. Variables that can take on an infinite number of possible 
values are called continuous variables. 


independent events 
Intuitively, two events A and B are independent if the occurrence of 
one has no effect on the probability of the occurrence of the other. For 
example, if you throw two dice, the probability that the second one 
comes up 1 is independent of whether the first die came up 1. 
Formally, this can be stated in terms of conditional probabilities: 
Pr[A | B] = Pr[A] and Pr[B | A] = Pr|B]. 


levels of measurement 
Measurement scales differ in their level of measurement. There are 
four common levels of measurement: 


1. Nominal scales are only labels. 

2. Ordinal Scales are ordered but are not truly quantitative. Equal 
intervals on the ordinal scale do not imply equal intervals on the 
underlying trait. 

3. Interval scales are ordered, and equal intervals on the scale represent
equal intervals on the underlying trait. However, interval scales do
not have a true zero point.

4. Ratio scales are interval scales that do have a true zero point.
With ratio scales, it is sensible to talk about one value being twice
as large as another, for example.


nominal scale 
A nominal scale is one of four Levels of Measurement. No ordering 
is implied, and addition/subtraction and multiplication/division would 
be inappropriate for a variable on a nominal scale. {Female, Male} 
and {Buddhist, Christian, Hindu, Muslim} have no natural 
ordering (except alphabetic). Occasionally, numeric values are 
nominal: for instance, if a variable was coded as Female=1, Male=2, 
the set {1, 2} is still nominal. 


ordinal scale 
One of four levels of measurement, an ordinal scale is a set of ordered 
values. However, there is no set distance between scale values. For 
instance, the scale (Very Poor, Poor, Average, Good, Very Good) is
an ordinal scale. You can assign numerical values to an ordinal scale:
rating performance such as 1 for "Very Poor," 2 for "Poor," etc., but
there is no assurance that the difference between a score of 1 and 2
means the same thing as the difference between a score of 2 and 3.


probability distribution 
For a discrete random variable, a probability distribution contains 
the probability of each possible outcome. The sum of all probabilities 
is always 1.0. 


qualitative variables 

Categorical Variable 

Also known as categorical variables, qualitative variables are variables 
with no natural sense of ordering. For instance, hair color (Black, Brown, 
Gray, Red, Yellow) is a qualitative variable, as is name (Adam, Becky, 
Christina, Dave . . .). Qualitative variables can be coded to appear numeric 
but their numbers are meaningless, as in male=1, female=2. Variables that 
are not qualitative are known as quantitative variables. 


quantitative variables 


Variables that are measured on a numeric or quantitative scale.
Ordinal, interval and ratio scales are quantitative. A country's 
population, a person's shoe size, or a car's speed are all quantitative 
variables. Variables that are not quantitative are known as qualitative 
variables. 


ratio scale 
One of the four basic levels of measurement, a ratio scale is a 
numerical scale with a true zero point and in which a given size 
interval has the same interpretation for the entire scale. Weight is a 
ratio scale. Therefore it is meaningful to say that a 200 pound person
weighs twice as much as a 100 pound person. 


variables 
Something that can take on different values. For example, different 
subjects in an experiment weigh different amounts. Therefore
"weight" is a variable in the experiment. Or, subjects may be given 
different doses of a drug. This would make "dosage" a variable. 
Variables can be dependent or independent, qualitative or 
quantitative, and continuous or discrete. 


Continuous Random Variables 

Continuous Random Variables: Introduction is part of the collection 
col10555 written by Barbara Illowsky and Susan Dean and serves as an 
introduction to the uniform and exponential distributions with contributions 
from Roberta Bloom. 


Student Learning Outcomes 
By the end of this chapter, the student should be able to: 


• Recognize and understand continuous probability density functions in
general.

• Recognize the uniform probability distribution and apply it
appropriately.

• Recognize the exponential probability distribution and apply it
appropriately.


Introduction 


Continuous random variables have many applications. Baseball batting 
averages, IQ scores, the length of time a long distance telephone call lasts, 
the amount of money a person carries, the length of time a computer chip 
lasts, and SAT scores are just a few. The field of reliability depends on a 
variety of continuous random variables. 


This chapter gives an introduction to continuous random variables and the 
many continuous distributions. We will be studying these continuous 
distributions for several chapters. 


Note:The values of discrete and continuous random variables can be 
ambiguous. For example, if X is equal to the number of miles (to the 
nearest mile) you drive to work, then X is a discrete random variable. You 
count the miles. If X is the distance you drive to work, then you measure 
values of X and X is a continuous random variable. How the random
variable is defined is very important. 


Properties of Continuous Probability Distributions 


The graph of a continuous probability distribution is a curve. Probability is 
represented by area under the curve. 


The curve is called the probability density function (abbreviated: pdf). 
We use the symbol f(x) to represent the curve. f(x) is the function that
corresponds to the graph; we use the density function f(x) to draw the
graph of the probability distribution. 


Area under the curve is given by a different function called the 
cumulative distribution function (abbreviated: cdf). The cumulative 
distribution function is used to evaluate probability as area. 


• The outcomes are measured, not counted.

• The entire area under the curve and above the x-axis is equal to 1.

• Probability is found for intervals of x values rather than for individual
x values.

• P(c < x < d) is the probability that the random variable X is in the
interval between the values c and d. P(c < x < d) is the area under
the curve, above the x-axis, to the right of c and the left of d.

• P(x = c) = 0. The probability that x takes on any single individual
value is 0. The area below the curve, above the x-axis, and between
x=c and x=c has no width, and therefore no area (area = 0). Since the
probability is equal to the area, the probability is also 0.


We will find the area that represents probability by using geometry, 
formulas, technology, or probability tables. In general, calculus is needed to 
find the area under the curve for many probability density functions. When 
we use formulas to find the area in this textbook, the formulas were found 
by using the techniques of integral calculus. However, because most 
students taking this course have not studied calculus, we will not be using 
calculus in this textbook. 


There are many continuous probability distributions. When using a 
continuous probability distribution to model probability, the distribution 
used is selected to best model and fit the particular situation. 


In this chapter and the next chapter, we will study the uniform distribution, 
the exponential distribution, and the normal distribution. The following 
graphs illustrate these distributions. 


[Figure: The Uniform Distribution. Shaded area represents P(3 < x < 6).]


The graph shows a Uniform
Distribution with the area between
x=3 and x=6 shaded to represent
the probability that the value of
the random variable X is in the
interval between 3 and 6.


[Figure: The Exponential Distribution. Shaded area represents P(2 < x < 4).]


The graph shows an
Exponential Distribution with
the area between x=2 and x=4
shaded to represent the
probability that the value of the
random variable X is in the
interval between 2 and 4.


[Figure: The Normal Distribution. Shaded area represents probability P(1 < x < 2).]


The graph shows the Standard Normal Distribution with
the area between x=1 and x=2 shaded to represent the
probability that the value of the random variable X is in
the interval between 1 and 2.

** With contributions from Roberta Bloom 


Glossary 


Uniform Distribution 
A continuous random variable (RV) that has equally likely outcomes
over the domain, a < x < b. Often referred to as the Rectangular
distribution because the graph of the pdf has the form of a rectangle.
Notation: X~U(a,b). The mean is $\mu = \frac{a+b}{2}$ and the standard deviation
is $\sigma = \sqrt{\frac{(b-a)^2}{12}}$. The probability density function is $f(x) = \frac{1}{b-a}$ for
a < x < b or a ≤ x ≤ b. The cumulative distribution is
$P(X \le x) = \frac{x-a}{b-a}$.

Exponential Distribution

A continuous random variable (RV) that appears when we are
interested in the intervals of time between some random events, for
example, the length of time between emergency arrivals at a hospital.
Notation: X~Exp(m). The mean is $\mu = \frac{1}{m}$ and the standard deviation
is $\sigma = \frac{1}{m}$. The probability density function is $f(x) = me^{-mx}$, x ≥ 0
and the cumulative distribution function is $P(X \le x) = 1 - e^{-mx}$.


Continuous Probability Functions 

This module introduces the continuous probability function and explores 
the relationship between the probability of X and the area under the curve 
of f(X). 


We begin by defining a continuous probability density function. We use the 
function notation f(x). Intermediate algebra may have been your first
formal introduction to functions. In the study of probability, the functions 
we study are special. We define the function f(x) so that the area between 
it and the x-axis is equal to a probability. Since the maximum probability is 
one, the maximum area is also one. 


For continuous probability distributions, PROBABILITY = AREA. 


Example: 
Consider the function $f(x) = \frac{1}{20}$ for 0 ≤ x ≤ 20, where x is a real number. The
graph of $f(x) = \frac{1}{20}$ is a horizontal line. However, since 0 ≤ x ≤ 20,
f(x) is restricted to the portion between x = 0 and x = 20, inclusive.


[Figure: graph of f(x) = 1/20 for 0 ≤ x ≤ 20.]
The graph of $f(x) = \frac{1}{20}$ is a horizontal line segment when 0 ≤ x ≤ 20.


The area between $f(x) = \frac{1}{20}$ where 0 ≤ x ≤ 20 and the x-axis is the area
of a rectangle with base = 20 and height = $\frac{1}{20}$.

$$\text{AREA} = 20 \cdot \frac{1}{20} = 1$$

This particular function, where we have restricted x so that the area
between the function and the x-axis is 1, is an example of a continuous
probability density function. It is used as a tool to calculate probabilities.
Suppose we want to find the area between $f(x) = \frac{1}{20}$ and the x-axis
where 0 < x < 2.


$$\text{AREA} = (2 - 0) \cdot \frac{1}{20} = 0.1$$

(2 - 0) = 2 = base of a rectangle

$\frac{1}{20}$ = the height.

The area corresponds to a probability. The probability that x is between 0
and 2 is 0.1, which can be written mathematically as

P(0 < x < 2) = P(x < 2) = 0.1.

Suppose we want to find the area between $f(x) = \frac{1}{20}$ and the x-axis
where 4 < x < 15.


$$\text{AREA} = (15 - 4) \cdot \frac{1}{20} = 0.55$$

(15 - 4) = 11 = the base of a rectangle

$\frac{1}{20}$ = the height.

The area corresponds to the probability P(4 < x < 15) = 0.55.
Suppose we want to find P(x = 15). On an x-y graph, x = 15 is a vertical
line. A vertical line has no width (or 0 width). Therefore,
$P(x = 15) = (\text{base})(\text{height}) = (0)\left(\frac{1}{20}\right) = 0$.


P(X ≤ x) (can be written as P(X < x) for continuous distributions) is
called the cumulative distribution function or CDF. Notice the "less than
or equal to" symbol. We can use the CDF to calculate P(X > x). The
CDF gives "area to the left" and P(X > x) gives "area to the right." We
calculate P(X > x) for continuous distributions as follows:

P(X > x) = 1 - P(X < x).


[Figure: graph of f(x) with the area to the left of x labeled P(X < x) and the area to the right labeled P(X > x) = 1 - P(X < x).]


Label the graph with f(x) and x. Scale the x and y axes with the maximum
x and y values. $f(x) = \frac{1}{20}$, 0 ≤ x ≤ 20.


[Figure: graph of f(x) = 1/20 with the area between x = 2.3 and x = 12.7 shaded.]


$P(2.3 < x < 12.7) = (\text{base})(\text{height}) = (12.7 - 2.3)\left(\frac{1}{20}\right) = 0.52$
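
The rectangle-area calculations in this module follow one pattern: base times height. The sketch below wraps that pattern in a hypothetical helper, `uniform_probability`, for the density f(x) = 1/20 on 0 ≤ x ≤ 20 and repeats the calculations from the text, including the "area to the right" rule.

```python
def uniform_probability(c, d, a=0.0, b=20.0):
    """P(c < X < d) when the density is the constant 1/(b - a) on [a, b].

    The probability is the area of a rectangle: base times height.
    """
    lower, upper = max(a, c), min(b, d)   # clip the interval to [a, b]
    base = max(0.0, upper - lower)
    height = 1.0 / (b - a)
    return base * height

p_0_to_2 = uniform_probability(0, 2)             # area between 0 and 2
p_4_to_15 = uniform_probability(4, 15)           # area between 4 and 15
p_exactly_15 = uniform_probability(15, 15)       # zero width, zero area
p_2_3_to_12_7 = uniform_probability(2.3, 12.7)   # area between 2.3 and 12.7

# "Area to the right" as one minus "area to the left"
p_greater_than_12_7 = 1 - uniform_probability(0, 12.7)

print(round(p_0_to_2, 2), round(p_4_to_15, 2), p_exactly_15,
      round(p_2_3_to_12_7, 2), round(p_greater_than_12_7, 3))
```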


The Uniform Distribution 

Continuous Random Variable: Uniform Distribution is part of the collection col10555 written by Barbara 
Illowsky and Susan Dean. It describes the properties of the Uniform Distribution with contributions from 
Roberta Bloom. 


Example: 

The previous problem is an example of the uniform probability distribution. 

Illustrate the uniform distribution. The data that follows are 55 smiling times, in seconds, of an
eight-week old baby.


10.4 19.6 18.8 13.9 17.8 16.8 21.6 17.9 IDES 11.1 4.9 
12.8 14.8 22.8 20.0 15.9 16.3 13.4 17.1 14.5 19.0 22.8 
ES 0.7 8.9 11.9 10.9 Tee) 5.9 3.7 WES 19.2 9.8 
5.8 6.9 2.6 5.8 21.7 11.8 3.4 2.1 4.5 6.3 10.7 


8.9 9.4 9.4 7.6 10.0 3.3 6.7 7.8 11.6 13.8 18.6 


sample mean = 11.49 and sample standard deviation = 6.23 

We will assume that the smiling times, in seconds, follow a uniform distribution between 0 and 23 
seconds, inclusive. This means that any smiling time from 0 to and including 23 seconds is equally likely. 
The histogram that could be constructed from the sample is an empirical distribution that closely matches 
the theoretical uniform distribution. 

Let X = length, in seconds, of an eight-week old baby's smile. 

The notation for the uniform distribution is 

X ~ U(a,b) where a = the lowest value of x and b = the highest value of x.

The probability density function is $f(x) = \frac{1}{b-a}$ for a ≤ x ≤ b.


For this example, X ~ U(0,23) and $f(x) = \frac{1}{23-0}$ for 0 ≤ x ≤ 23.


Formulas for the theoretical mean and standard deviation are


$\mu = \frac{a+b}{2}$ and $\sigma = \sqrt{\frac{(b-a)^2}{12}}$

For this problem, the theoretical mean and standard deviation are


$\mu = \frac{0+23}{2} = 11.50$ seconds and $\sigma = \sqrt{\frac{(23-0)^2}{12}} = 6.64$ seconds
Notice that the theoretical mean and standard deviation are close to the sample mean and standard 


deviation. 
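
The theoretical mean and standard deviation of a uniform distribution depend only on the endpoints a and b. The sketch below evaluates the two formulas for the smiling-time example; the function names are illustrative, and the sample statistics are the values quoted in the text.

```python
from math import sqrt

def uniform_mean(a, b):
    """Theoretical mean of X ~ U(a, b)."""
    return (a + b) / 2

def uniform_sd(a, b):
    """Theoretical standard deviation of X ~ U(a, b)."""
    return sqrt((b - a) ** 2 / 12)

mu = uniform_mean(0, 23)
sigma = uniform_sd(0, 23)

# Sample statistics as stated in the example
sample_mean, sample_sd = 11.49, 6.23

print(f"theoretical mean = {mu:.2f}, sample mean = {sample_mean}")
print(f"theoretical sd   = {sigma:.2f}, sample sd   = {sample_sd}")
```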


Example: 
Exercise: 


Problem: 


What is the probability that a randomly chosen eight-week old baby smiles between 2 and 18 
seconds? 


Solution: 
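Find P(2 < x < 18):


$P(2 < x < 18) = (\text{base})(\text{height}) = (18 - 2) \cdot \frac{1}{23} = \frac{16}{23}$.


[Figure: graph of f(x) = 1/23 with the area between x = 2 and x = 18 shaded.]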


Exercise: 


Problem: Find the 90th percentile for an eight week old baby's smiling time. 

Solution: 

Ninety percent of the smiling times fall below the 90th percentile, k, so P(x < k) = 0.90

(base)(height) = 0.90

$(k - 0) \cdot \frac{1}{23} = 0.90$


k = 23 · 0.90 = 20.7


[Figure: graph of f(x) = 1/23 with the area to the left of k shaded. AREA = P(X < k) = 0.90]
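
Solving (k - a) · 1/(b - a) = p for k gives k = a + p(b - a), which works for any percentile of a uniform distribution. The sketch below applies this rearrangement to the smiling-time example; `uniform_percentile` is a hypothetical helper name.

```python
def uniform_percentile(p, a, b):
    """Value k such that P(X < k) = p for X ~ U(a, b)."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    return a + p * (b - a)

a, b = 0, 23
k = uniform_percentile(0.90, a, b)

# Check: the area to the left of k should equal 0.90
area_left_of_k = (k - a) * (1 / (b - a))

print(round(k, 1), round(area_left_of_k, 2))
```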


Exercise: 


Problem: 


Find the probability that a random eight week old baby smiles more than 12 seconds KNOWING 
that the baby smiles MORE THAN 8 SECONDS. 


Solution: 


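Find P(x > 12 | x > 8). There are two ways to do the problem. For the first way, use the fact that
this is a conditional and changes the sample space. The graph illustrates the new sample space. You
already know the baby smiled more than 8 seconds.


Write a new f(x): $f(x) = \frac{1}{23-8} = \frac{1}{15}$ for 8 < x < 23


$P(x > 12 \mid x > 8) = (23 - 12) \cdot \frac{1}{15} = \frac{11}{15}$


[Figure: graph of the new f(x) = 1/15 on 8 < x < 23 with the area to the right of x = 12 shaded.]

For the second way, use the conditional formula from Probability Topics with the original
distribution X ~ U(0, 23):


$P(A \mid B) = \frac{P(A \text{ AND } B)}{P(B)}$. For this problem, A is (x > 12) and B is (x > 8).


So, $P(x > 12 \mid x > 8) = \frac{P(x > 12 \text{ AND } x > 8)}{P(x > 8)} = \frac{P(x > 12)}{P(x > 8)} = \frac{11/23}{15/23} = 0.733$


[Figure: graph of the original f(x) = 1/23 on 0 ≤ x ≤ 23, with the x-axis marked from 0 to 24.]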
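
Both approaches to the conditional probability can be carried out with exact fractions to confirm that they agree. This is a minimal sketch for X ~ U(0, 23); the variable names are illustrative.

```python
from fractions import Fraction

a, b = 0, 23          # original distribution X ~ U(0, 23)
given, target = 8, 12  # condition x > 8, event x > 12

# First way: shrink the sample space to (8, 23) and use the new height
new_height = Fraction(1, b - given)
first_way = (b - target) * new_height

# Second way: P(A | B) = P(A AND B) / P(B) with the original height 1/23.
# Here "x > 12 AND x > 8" is the same event as "x > 12".
p_greater_12 = Fraction(b - target, b - a)
p_greater_8 = Fraction(b - given, b - a)
second_way = p_greater_12 / p_greater_8

print(first_way, second_way, round(float(second_way), 3))
```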


Example: 

Uniform: The amount of time, in minutes, that a person must wait for a bus is uniformly distributed 
between 0 and 15 minutes, inclusive. 

Exercise: 


Problem: What is the probability that a person waits fewer than 12.5 minutes? 


Solution: 


Let X = the number of minutes a person must wait for a bus. a = 0 and b = 15. X ~ U(0, 15). Write
the probability density function. $f(x) = \frac{1}{15-0} = \frac{1}{15}$ for 0 ≤ x ≤ 15.

Find P(x < 12.5). Draw a graph.
$P(x < 12.5) = (\text{base})(\text{height}) = (12.5 - 0) \cdot \frac{1}{15} = 0.8333$


The probability a person waits less than 12.5 minutes is 0.8333.


[Figure: graph of f(x) = 1/15 with the area to the left of x = 12.5 shaded.]

Exercise: 


Problem: On the average, how long must a person wait? 


Find the mean, pz, and the standard deviation, o. 


Solution: 


$\mu = \frac{a+b}{2} = \frac{15+0}{2} = 7.5$. On the average, a person must wait 7.5 minutes.


$\sigma = \sqrt{\frac{(b-a)^2}{12}} = \sqrt{\frac{(15-0)^2}{12}} = 4.3$. The standard deviation is 4.3 minutes.


Exercise: 


Problem: Ninety percent of the time, the time a person must wait falls below what value? 


Note: This asks for the 90th percentile. 


Solution: 


Find the 90th percentile. Draw a graph. Let k = the 90th percentile. 


P(x < k) = (base)(height) = (k - 0) · (1/15)

0.90 = k · (1/15)
k = (0.90)(15) = 13.5
k is sometimes called a critical value. 


The 90th percentile is 13.5 minutes. Ninety percent of the time, a person must wait at most 13.5 
minutes. 


[Figure: graph of f(x) = 1/15 with the area to the left of k shaded; AREA = P(x < k) = 0.90]
Example: 


Uniform: Suppose the time it takes a nine-year old to eat a donut is between 0.5 and 4 minutes, inclusive. 
Let X = the time, in minutes, it takes a nine-year old child to eat a donut. Then X ~ U(0.5, 4). 
Exercise: 


Problem: 


The probability that a randomly selected nine-year old child eats a donut in at least two minutes is 


Solution: 


0.5714 
Exercise: 
Problem: 


Find the probability that a different nine-year old child eats a donut in more than 2 minutes given that 
the child has already been eating the donut for more than 1.5 minutes. 


The second probability question has a conditional (refer to "Probability Topics"). You are asked to 
find the probability that a nine-year old child eats a donut in more than 2 minutes given that the child 
has already been eating the donut for more than 1.5 minutes. Solve the problem two different ways 
(see the first example). You must reduce the sample space. First way: Since you already know the 
child has already been eating the donut for more than 1.5 minutes, you are no longer starting at 

a = 0.5 minutes. Your starting point is 1.5 minutes. 


Write a new f(x): 


f(x) = 1/(4 - 1.5) = 2/5 for 1.5 ≤ x ≤ 4.

Find P(x > 2 | x > 1.5). Draw a graph.

[Figure: graph of the new f(x) = 2/5 on 1.5 ≤ x ≤ 4]

P(x > 2 | x > 1.5) = (base)(new height) = (4 - 2)(2/5) = ?


Solution: 


4/5

The probability that a nine-year old child eats a donut in more than 2 minutes given that the child has
already been eating the donut for more than 1.5 minutes is 4/5.


Second way: Draw the original graph for x ~ U(0.5, 4). Use the conditional formula

P(x > 2 | x > 1.5) = P(x > 2 AND x > 1.5) / P(x > 1.5) = P(x > 2) / P(x > 1.5) = (2/3.5) / (2.5/3.5) = 0.8 = 4/5


Note: See "Summary of the Uniform and Exponential Probability Distributions" for a full summary.


Example: 
Uniform: Ace Heating and Air Conditioning Service finds that the amount of time a repairman needs to
fix a furnace is uniformly distributed between 1.5 and 4 hours. Let x = the time needed to fix a furnace.
Then x ~ U(1.5, 4).

1. Find the probability that a randomly selected furnace repair requires more than 2 hours.

2. Find the probability that a randomly selected furnace repair requires less than 3 hours.

3. Find the 30th percentile of furnace repair times.

4. The longest 25% of furnace repairs take at least how long? (In other words: Find the
minimum time for the longest 25% of repair times.) What percentile does this represent?

5. Find the mean and standard deviation.


Exercise: 


Problem: Find the probability that a randomly selected furnace repair requires longer than 2 hours. 
Solution: 


To find f(x): f(x) = 1/(4 - 1.5) = 1/2.5, so f(x) = 0.4


P(x>2) = (base)(height) = (4 — 2)(0.4) = 0.8 


Example 4 Figure 1: Uniform distribution between 1.5 and 4 (height f(x) = 0.4), with the shaded
area between 2 and 4 representing P(x > 2), the probability that the repair time x is greater than 2.


Exercise: 


Problem: 


Find the probability that a randomly selected furnace repair requires less than 3 hours. Describe how 
the graph differs from the graph in the first part of this example. 


Solution: 
P(x < 3) = (base)(height) = (3 - 1.5)(0.4) = 0.6
The graph of the rectangle showing the entire distribution would remain the same. However, the
graph should be shaded between x = 1.5 and x = 3. Note that the shaded area starts at x = 1.5 rather than
at x = 0; since X ~ U(1.5, 4), x cannot be less than 1.5.


Example 4 Figure 2: Uniform distribution between 1.5 and 4 (height f(x) = 0.4), with the shaded
area between 1.5 and 3 representing P(x < 3), the probability that the repair time x is less than 3.


Exercise: 


Problem: Find the 30th percentile of furnace repair times. 


Solution: 


Example 4 Figure 3: Uniform distribution between 1.5 and 4 (height f(x) = 0.4) with an area of 0.30
shaded to the left of k, representing the shortest 30% of repair times. Area = P(X < k) = 0.3.


P(x < k) = 0.30
P(x < k) = (base)(height) = (k - 1.5) · (0.4)


• 0.3 = (k - 1.5)(0.4); solve to find k:
• 0.75 = k - 1.5, obtained by dividing both sides by 0.4
• k = 2.25, obtained by adding 1.5 to both sides
The 30th percentile of repair times is 2.25 hours. 30% of repair times are 2.25 hours or less.


Exercise: 


Problem: 


The longest 25% of furnace repair times take at least how long? (Find the minimum time for the 
longest 25% of repairs.) 


Solution: 
Example 4 Figure 4: Uniform distribution between 1.5 and 4 (height f(x) = 0.4) with an area of 0.25
shaded to the right of k, representing the longest 25% of repair times. Area = P(X > k) = 0.25.


P(x > k) = 0.25 
P(x > k) = (base)(height) = (4 — k) - (0.4) 


• 0.25 = (4 - k)(0.4); solve for k:
• 0.625 = 4 - k, obtained by dividing both sides by 0.4
• -3.375 = -k, obtained by subtracting 4 from both sides
• k = 3.375


The longest 25% of furnace repairs take at least 3.375 hours (3.375 hours or longer). 


Note: Since 25% of repair times are 3.375 hours or longer, that means that 75% of repair times are 
3.375 hours or less. 3.375 hours is the 75th percentile of furnace repair times. 
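A simulation offers an independent check on the furnace repair answers. The sketch below is an illustration, not part of the original example: it assumes X ~ U(1.5, 4), draws repair times with Python's standard `random` module, and compares the observed fractions with the areas computed above. The function name and the fixed seed are arbitrary choices.

```python
import random


def simulate_repair_times(n, a=1.5, b=4.0, seed=1):
    """Draw n simulated repair times from U(a, b) using a fixed seed."""
    rng = random.Random(seed)
    return [rng.uniform(a, b) for _ in range(n)]


times = simulate_repair_times(100_000)
n = len(times)

frac_over_2 = sum(t > 2 for t in times) / n             # compare with 0.8
frac_under_3 = sum(t < 3 for t in times) / n            # compare with 0.6
frac_under_p30 = sum(t < 2.25 for t in times) / n       # compare with 0.30
frac_at_least_p75 = sum(t >= 3.375 for t in times) / n  # compare with 0.25

print(frac_over_2, frac_under_3, frac_under_p30, frac_at_least_p75)
```

The simulated fractions will not match the exact areas digit for digit, but with 100,000 draws they should land within about one percentage point.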


Exercise: 


Problem: Find the mean and standard deviation 


Solution: 
μ = (a + b)/2 and σ = √((b - a)²/12)

μ = (1.5 + 4)/2 = 2.75 hours and σ = √((4 - 1.5)²/12) = 0.7217 hours


Note: See "Summary of the Uniform and Exponential Probability Distributions" for a full summary.


**Example 5 contributed by Roberta Bloom 


Glossary 


Conditional Probability 
The likelihood that an event will occur given that another event has already occurred. 


Uniform Distribution 
A continuous random variable (RV) that has equally likely outcomes over the domain, a < x < b.
Often referred to as the Rectangular distribution because the graph of the pdf has the form of a
rectangle. Notation: X~U(a,b). The mean is μ = (a + b)/2 and the standard deviation is σ = √((b - a)²/12).
The probability density function is f(x) = 1/(b - a) for a < x < b or a ≤ x ≤ b. The cumulative
distribution is P(X ≤ x) = (x - a)/(b - a).


The Exponential Distribution 
This module introduces the properties of the exponential distribution, the behavior of probabilities that reflect a 
large number of small values and a small number of high values. 


The exponential distribution is often concerned with the amount of time until some specific event occurs. For 
example, the amount of time (beginning now) until an earthquake occurs has an exponential distribution. Other 
examples include the length, in minutes, of long distance business telephone calls, and the amount of time, in 
months, a car battery lasts. It can be shown, too, that the value of the change that you have in your pocket or 
purse approximately follows an exponential distribution. 


Values for an exponential random variable occur in the following way. There are fewer large values and more 
small values. For example, the amount of money customers spend in one trip to the supermarket follows an 
exponential distribution. There are more people that spend less money and fewer people that spend large 
amounts of money. 


The exponential distribution is widely used in the field of reliability. Reliability deals with the amount of time a 
product lasts. 


Example: 

Illustrates the exponential distribution: Let X = amount of time (in minutes) a postal clerk spends with 
his/her customer. The time is known to have an exponential distribution with the average amount of time equal 
to 4 minutes. 

X is a continuous random variable since time is measured. It is given that μ = 4 minutes. To do any
calculations, you must know m, the decay parameter.

m = 1/μ. Therefore, m = 1/4 = 0.25

The standard deviation, σ, is the same as the mean. μ = σ

The distribution notation is X~Exp(m). Therefore, X~Exp(0.25).

The probability density function is f(x) = m·e^(-m·x). The number e = 2.71828182846... It is a number that is
used often in mathematics. Scientific calculators have the key "e^x." If you enter 1 for x, the calculator will
display the value e.

The curve is:

f(x) = 0.25·e^(-0.25·x) where x is at least 0 and m = 0.25.

For example, f(5) = 0.25·e^(-0.25·5) = 0.072


The graph is as follows: 
[Figure: graph of f(x) for x from 0 to 20, a declining curve that starts at height m = 0.25; the mean μ = 4 is marked on the x-axis]


Notice the graph is a declining curve. When x = 0,
f(x) = 0.25·e^(-0.25·0) = 0.25·1 = 0.25 = m
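The density values quoted for the postal clerk example can be reproduced with the standard `math` module. This is a minimal sketch assuming X ~ Exp(0.25); `exp_pdf` is a hypothetical helper name.

```python
import math


def exp_pdf(x, m):
    """Density f(x) = m * e^(-m*x) for x >= 0, and 0 for negative x."""
    if x < 0:
        return 0.0
    return m * math.exp(-m * x)


print(exp_pdf(0, 0.25))            # 0.25, the height at x = 0 equals m
print(round(exp_pdf(5, 0.25), 3))  # 0.072
```

Evaluating `exp_pdf` at increasing values of x shows the declining shape of the curve.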


Example: 
Exercise: 


Problem: Find the probability that a clerk spends four to five minutes with a randomly selected customer. 
Solution: 

Find P(4 < x < 5). 

The cumulative distribution function (CDF) gives the area to the left. 

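P(X < x) = 1 - e^(-m·x)

P(x < 5) = 1 - e^(-0.25·5) = 0.7135 and P(x < 4) = 1 - e^(-0.25·4) = 0.6321


[Figure: graph of f(x) starting at height 0.25, with the area between x = 4 and x = 5 shaded; P(4 < x < 5)]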


Note: You can do these calculations easily on a calculator. 


The probability that a postal clerk spends four to five minutes with a randomly selected customer is 


P(4 < x < 5) = P(x < 5) - P(x < 4) = 0.7135 - 0.6321 = 0.0814


Note: TI-83+ and TI-84: On the home screen, enter (1-e^(-.25*5))-(1-e^(-.25*4)) or enter e^(-.25*4)-e^(-.25*5).
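For readers without a TI calculator, the same area can be computed in Python. The sketch below assumes X ~ Exp(0.25); the function names `exp_cdf` and `exp_prob_between` are illustrative.

```python
import math


def exp_cdf(x, m):
    """Area to the left: P(X < x) = 1 - e^(-m*x) for x > 0."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-m * x)


def exp_prob_between(c, d, m):
    """P(c < X < d) as the difference of two areas to the left."""
    return exp_cdf(d, m) - exp_cdf(c, m)


print(round(exp_cdf(5, 0.25), 4))              # 0.7135
print(round(exp_cdf(4, 0.25), 4))              # 0.6321
print(round(exp_prob_between(4, 5, 0.25), 4))  # 0.0814
```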


Exercise: 


Problem: Half of all customers are finished within how long? (Find the 50th percentile) 


Solution: 


Find the 50th percentile. 


[Figure: graph of f(x) with the area to the left of k shaded; P(x < k) = 0.50]


P(x < k) = 0.50, k = 2.8 minutes (calculator or computer)
Half of all customers are finished within 2.8 minutes. 

You can also do the calculation as follows: 

P(x < k) = 0.50 and P(x < k) = 1 - e^(-0.25·k)

Therefore, 0.50 = 1 - e^(-0.25·k) and e^(-0.25·k) = 1 - 0.50 = 0.5


Take natural logs: ln(e^(-0.25·k)) = ln(0.50). So, -0.25·k = ln(0.50)


Solve for k: k = ln(0.50)/(-0.25) = 2.8 minutes


Note: A formula for the percentile k is k = LN(1 - AreaToTheLeft)/(-m), where LN is the natural log.


Note: TI-83+ and TI-84: On the home screen, enter LN(1-.50)/-.25. Press the (-) for the negative. 
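The percentile formula translates directly into code. This sketch assumes X ~ Exp(0.25) for the postal clerk example; `exp_percentile` is an illustrative name rather than a library function.

```python
import math


def exp_percentile(area_to_left, m):
    """Return k such that P(X < k) = area_to_left for X ~ Exp(m)."""
    if not 0 <= area_to_left < 1:
        raise ValueError("area_to_left must be at least 0 and less than 1")
    return math.log(1 - area_to_left) / (-m)


median = exp_percentile(0.50, 0.25)
print(round(median, 1))  # 2.8
```

Because the median (about 2.8 minutes) is below the mean of 4 minutes, the sketch also illustrates the skew of the exponential distribution.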


Exercise: 


Problem: Which is larger, the mean or the median? 
Solution: 
Is the mean or median larger? 


From part b, the median or 50th percentile is 2.8 minutes. The theoretical mean is 4 minutes. The mean is 
larger. 


Optional Collaborative Classroom Activity 


Have each class member count the change he/she has in his/her pocket or purse. Your instructor will record the 
amounts in dollars and cents. Construct a histogram of the data taken by the class. Use 5 intervals. Draw a 
smooth curve through the bars. The graph should look approximately exponential. Then calculate the mean. 


Let X = the amount of money a student in your class has in his/her pocket or purse. 


The distribution for X is approximately exponential with mean, μ = ____ and m = ____. The standard
deviation, σ = ____.


Draw the appropriate exponential graph. You should label the x and y axes, the decay rate, and the mean. Shade 


the area that represents the probability that one student has less than $.40 in his/her pocket or purse. (Shade 
P(x < 0.40)). 


Example: 


On the average, a certain computer part lasts 10 years. The length of time the computer part lasts is 
exponentially distributed. 
Exercise: 


Problem: What is the probability that a computer part lasts more than 7 years? 
Solution: 

Let x = the amount of time (in years) a computer part lasts. 

μ = 10 so m = 1/μ = 1/10 = 0.1

Find P(x > 7). Draw a graph.

P(x > 7) = 1 - P(x < 7).

Since P(X < x) = 1 - e^(-mx), then P(X > x) = 1 - (1 - e^(-mx)) = e^(-mx)


P(x > 7) = e^(-0.1·7) = 0.4966. The probability that a computer part lasts more than 7 years is 0.4966.


Note: TI-83+ and TI-84: On the home screen, enter e^(-.1*7).
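The area to the right of a value can also be computed directly. The following minimal sketch assumes X ~ Exp(0.1) for the computer part; `exp_prob_greater` is a hypothetical helper name.

```python
import math


def exp_prob_greater(x, m):
    """Area to the right: P(X > x) = e^(-m*x) for x > 0."""
    if x <= 0:
        return 1.0
    return math.exp(-m * x)


p = exp_prob_greater(7, 0.1)
print(round(p, 4))  # 0.4966
```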


[Figure: graph of f(x) starting at height 0.1, with the area to the right of x = 7 shaded; P(x > 7); the mean μ = 10 is marked]
Exercise: 


Problem: On the average, how long would 5 computer parts last if they are used one after another? 


Solution: 


On the average, 1 computer part lasts 10 years. Therefore, 5 computer parts, if they are used one right after 
the other would last, on the average, 


(5)(10) = 50 years. 
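The claim that five parts used one after another last 50 years on average can be checked by simulation. This sketch is an illustration only: it assumes each lifetime is an independent draw from Exp(0.1) and uses `random.expovariate` from Python's standard library, whose argument is the rate (1 divided by the mean). The function name and the seed are arbitrary.

```python
import random


def average_total_lifetime(parts, m, trials, seed=1):
    """Average, over many trials, of the summed lifetimes of `parts` parts."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # One trial: the parts are used one after another, so lifetimes add.
        total += sum(rng.expovariate(m) for _ in range(parts))
    return total / trials


avg = average_total_lifetime(parts=5, m=0.1, trials=100_000)
print(round(avg, 1))  # close to 50
```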

Exercise: 
Problem: Eighty percent of computer parts last at most how long? 
Solution: 


Find the 80th percentile. Draw a graph. Let k = the 80th percentile.


[Figure: graph of f(x) starting at height 0.1, with the area to the left of k shaded; P(x < k) = 0.80]


Solve for k: k = ln(1 - 0.80)/(-0.1) = 16.1 years


Eighty percent of the computer parts last at most 16.1 years. 


Note: TI-83+ and TI-84: On the home screen, enter LN(1 - .80)/-.1 


Exercise: 


Problem: What is the probability that a computer part lasts between 9 and 11 years? 
Solution: 


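Find P(9 < x < 11). Draw a graph.


[Figure: graph of f(x) starting at height 0.1, with the area between x = 9 and x = 11 shaded; P(9 < x < 11)]


P(9 < x < 11) = P(x < 11) - P(x < 9) = (1 - e^(-0.1·11)) - (1 - e^(-0.1·9)) = 0.6671 - 0.5934 = 0.0737
(calculator or computer)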


The probability that a computer part lasts between 9 and 11 years is 0.0737. 


Note: TI-83+ and TI-84: On the home screen, enter e^(-.1*9) - e^(-.1*11).


Example: 
Suppose that the length of a phone call, in minutes, is an exponential random variable with decay parameter =
1/12. If another person arrives at a public telephone just before you, find the probability that you will have to
wait more than 5 minutes. Let X = the length of a phone call, in minutes.
Exercise: 


Problem: What is m, μ, and σ? The probability that you must wait more than 5 minutes is


Solution: 


P(x > 5) = 0.6592 


Note: A summary for exponential distribution is available in "Summary of The Uniform and Exponential
Probability Distributions". 


Glossary 


Exponential Distribution 
A continuous random variable (RV) that appears when we are interested in the intervals of time between
some random events, for example, the length of time between emergency arrivals at a hospital. Notation:
X~Exp(m). The mean is μ = 1/m and the standard deviation is σ = 1/m. The probability density function is
f(x) = m·e^(-mx), x ≥ 0 and the cumulative distribution function is P(X ≤ x) = 1 - e^(-mx).


Summary of the Uniform and Exponential Probability Distributions 

This module provides a summary of formulas and definitions related to Continuous Random 
Variables. 

Formula 

Uniform 


X = a real number between a and b (in some instances, X can take on the values a and b). a =
smallest X; b = largest X


X ~ U(a, b)


The mean is μ = (a + b)/2


The standard deviation is σ = √((b - a)²/12)


Probability density function: f(X) = 1/(b - a) for a ≤ X ≤ b


Area to the Left of x: P(X < x) = (base)(height)
Area to the Right of x: P(X > x) = (base)(height)


Area Between c and d: P(c < X < d) = (base)(height) = (d - c)(height).
Formula 
Exponential 


X ~ Exp(m) 

X = a real number, 0 or larger. m = the parameter that controls the rate of decay or decline
The mean and standard deviation are the same.

μ = σ = 1/m and m = 1/μ = 1/σ

The probability density function: f(X) = m·e^(-m·X), X ≥ 0
Area to the Left of x: P(X < x) = 1 - e^(-m·x)

Area to the Right of x: P(X > x) = e^(-m·x)


Area Between c and d:
P(c < X < d) = P(X < d) - P(X < c) = (1 - e^(-m·d)) - (1 - e^(-m·c)) = e^(-m·c) - e^(-m·d)


Percentile, k: k = LN(1 - AreaToTheLeft)/(-m)


Practice 1: Uniform Distribution 
In this module the student will explore the properties of data with a uniform 
distribution. 


Student Learning Outcomes 


• The student will analyze data following a uniform distribution.


Given 


The age of cars in the staff parking lot of a suburban college is uniformly 
distributed from six months (0.5 years) to 9.5 years. 


Describe the Data 


Exercise: 


Problem: What is being measured here? 


Solution: 


The age of cars in the staff parking lot 


Exercise: 


Problem: In words, define the Random Variable X. 


Solution: 


X = The age (in years) of cars in the staff parking lot 


Exercise: 


Problem: Are the data discrete or continuous? 


Solution: 


Continuous 


Exercise: 


Problem: The interval of values for x is:


Solution: 


0.5-9.5 


Exercise: 


Problem: The distribution for X is: 


Solution: 


X ~U(0.5,9.5) 


Probability Distribution 


Exercise: 


Problem: Write the probability density function. 


Solution: 


Exercise: 


Problem: Graph the probability distribution. 


a. Sketch the graph of the probability distribution.

b. Identify the following values:

   i. Lowest value for x:
   ii. Highest value for x:
   iii. Height of the rectangle:
   iv. Label for x-axis (words):
   v. Label for y-axis (words):


Solution: 


b.i. 0.5
b.ii. 9.5
b.iii. 1/9
b.iv. Age of Cars
b.v. f(x)


Random Probability 


Exercise: 


Problem: 


Find the probability that a randomly chosen car in the lot was less than 
4 years old. 


a. Sketch the graph. Shade the area of interest.

b. Find the probability. P(x < 4) =


Solution:

b. 3.5/9
Exercise: 
Problem: 


Out of just the cars less than 7.5 years old, find the probability that a 
randomly chosen car in the lot was less than 4 years old. 


a. Sketch the graph. Shade the area of interest.

b. Find the probability. P(x < 4 | x < 7.5) =


Solution:

b. 3.5/7


Exercise: 
Discussion Question 


Problem: 
What has changed in the previous two problems that made the 
solutions different? 

Quartiles 


Exercise: 


Problem: Find the average age of the cars in the lot. 
Solution: 
μ = 5

Exercise: 


Problem: 


Find the third quartile of ages of cars in the lot. This means you will
have to find the value such that 3/4, or 75%, of the cars are at most (less
than or equal to) that age.

a. Sketch the graph. Shade the area of interest.

b. Find the value k such that P(x < k) = 0.75.
c. The third quartile is:


Solution:

b. k = 7.25


Practice 2: Exponential Distribution 
In this module the student will explore the properties of data with an 
exponential distribution. 


Student Learning Outcomes 


• The student will analyze data following the exponential distribution.


Given 
Carbon-14 is a radioactive element with a half-life of about 5730 years.
Carbon-14 is said to decay exponentially. The decay rate is 0.000121. We
start with 1 gram of carbon-14. We are interested in the time (years) it takes
to decay carbon-14.


Describe the Data 


Exercise: 


Problem: What is being measured here? 


Exercise: 


Problem: Are the data discrete or continuous? 


Solution: 


Continuous 


Exercise: 


Problem: In words, define the Random Variable X. 


Solution: 


X = Time (years) to decay carbon-14 


Exercise: 


Problem: What is the decay rate (m)? 


Solution: 


m = 0.000121 


Exercise: 


Problem: The distribution for X is: 


Solution: 


X ~ Exp(0.000121) 


Probability 


Exercise: 


Problem: 


Find the amount (percent of 1 gram) of carbon-14 lasting less than 
5730 years. This means, find P(x < 5730). 


a. Sketch the graph. Shade the area of interest.

b. Find the probability. P(x < 5730) =


Solution:

b. P(x < 5730) = 0.5001
Exercise: 
Problem: 
Find the percentage of carbon-14 lasting longer than 10,000 years. 


a. Sketch the graph. Shade the area of interest.

b. Find the probability. P(x > 10000) =


Solution:
b. P(x > 10000) = 0.2982
Exercise: 
Problem: 
Thirty percent (30%) of carbon-14 will decay within how many years? 


a. Sketch the graph. Shade the area of interest.

b. Find the value k such that P(x < k) = 0.30.


Solution:

b. k = 2947.73
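The carbon-14 answers can be tied back to the stated half-life. The sketch below relies on the standard relation for exponential decay, m = ln(2)/half-life, which the practice does not state explicitly and is therefore an assumption here; it then recomputes the three answers using the rounded rate 0.000121 given in the problem. The variable names are illustrative.

```python
import math

half_life = 5730  # years, as given in the problem

# Assumed relation for exponential decay: half of the material remains
# after one half-life, so e^(-m * half_life) = 0.5.
m_from_half_life = math.log(2) / half_life
print(round(m_from_half_life, 6))  # 0.000121

m = 0.000121  # rounded decay rate used in the practice

p_less_5730 = 1 - math.exp(-m * 5730)  # area to the left of 5730
p_more_10000 = math.exp(-m * 10000)    # area to the right of 10000
k_30 = math.log(1 - 0.30) / (-m)       # 30th percentile

print(round(p_less_5730, 4), round(p_more_10000, 4), round(k_30, 2))
```

The small gap between 0.5001 and exactly one half comes from rounding the decay rate to 0.000121.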


Homework 
This module provides a number of homework exercises related to 
Continuous Random Variables. 


For each probability and percentile problem, DRAW THE PICTURE! 
Exercise: 


Problem: 


Consider the following experiment. You are one of 100 people enlisted 
to take part in a study to determine the percent of nurses in America 
with an R.N. (registered nurse) degree. You ask nurses if they have an 
R.N. degree. The nurses answer “yes” or “no.” You then calculate the 
percentage of nurses with an R.N. degree. You give that percentage to 
your supervisor. 


a. What part of the experiment will yield discrete data?
b. What part of the experiment will yield continuous data?


Exercise: 


Problem: 


When age is rounded to the nearest year, do the data stay continuous, 
or do they become discrete? Why? 


Exercise: 


Problem: 


Births are approximately uniformly distributed between the 52 weeks
of the year. They can be said to follow a Uniform Distribution from 1 to
53 (spread of 52 weeks).


a. X ~
b. Graph the probability distribution.
c. f(x) =
d. μ =
e. σ =
f. Find the probability that a person is born at the exact moment
week 19 starts. That is, find P(x = 19) =
g. P(2 < x < 31) =
h. Find the probability that a person is born after week 40.
i. P(12 < x | x < 28) =
j. Find the 70th percentile.
k. Find the minimum for the upper quarter.


Solution: 


a. X ~ U(1, 53)
c. f(x) = 1/52 where 1 ≤ x ≤ 53


Exercise: 


Problem: 


A random number generator picks a number from 1 to 9 in a uniform 
manner. 


a. X ~
b. Graph the probability distribution.
c. f(x) =
d. μ =
e. σ =
f. P(3.5 < x < 7.25) =
g. P(x > 5.67) =
h. P(x > 5 | x > 3) =
i. Find the 90th percentile.
Exercise: 
Problem: 


The time (in minutes) until the next bus departs a major bus depot
follows a distribution with f(x) = 1/20 where x goes from 25 to 45
minutes.


a. Define the random variable. X =
b. X ~
c. Graph the probability distribution.
d. The distribution is ____ (name of distribution). It is ____
(discrete or continuous).
e. μ =
f. σ =
g. Find the probability that the time is at most 30 minutes. Sketch
and label a graph of the distribution. Shade the area of interest.
Write the answer in a probability statement.
h. Find the probability that the time is between 30 and 40 minutes.
Sketch and label a graph of the distribution. Shade the area of
interest. Write the answer in a probability statement.
i. P(25 < x < 55) = ____. State this in a probability
statement (similar to g and h), draw the picture, and find the
probability.
j. Find the 90th percentile. This means that 90% of the time, the
time is less than ____ minutes.
k. Find the 75th percentile. In a complete sentence, state what this
means. (See j.)
l. Find the probability that the time is more than 40 minutes given
(or knowing that) it is at least 30 minutes.


Solution: 


b. X ~ U(25, 45)
d. uniform; continuous
e. 35 minutes
f. 5.8 minutes
g. 0.25
h. 0.5
i. 1
j. 43 minutes
k. 40 minutes
l. 0.3333


Exercise: 


Problem: 


According to a study by Dr. John McDougall of his live-in weight loss 
program at St. Helena Hospital, the people who follow his program 
lose between 6 and 15 pounds a month until they approach trim body 
weight. Let’s suppose that the weight loss is uniformly distributed. We 
are interested in the weight loss of a randomly selected individual 
following the program for one month. (Source: The McDougall 
Program for Maximum Weight Loss by John A. McDougall, M.D.) 


a. Define the random variable. X =
b. X ~
c. Graph the probability distribution.
d. f(x) =
e. μ =
f. σ =
g. Find the probability that the individual lost more than 10
pounds in a month.
h. Suppose it is known that the individual lost more than 10
pounds in a month. Find the probability that he lost less than 12
pounds in the month.


ae ed i caer a a ra . State this ina 
probability question (similar to g and h), draw the picture, and 
find the probability. 


Exercise: 


Problem: 


A subway train on the Red Line arrives every 8 minutes during rush 
hour. We are interested in the length of time a commuter must wait for 
a train to arrive. The time follows a uniform distribution. 


a. Define the random variable. X =
b. X ~
c. Graph the probability distribution.
d. f(x) =
g. Find the probability that the commuter waits less than one
minute.
h. Find the probability that the commuter waits between three and
four minutes.
i. 60% of commuters wait more than how long for the train? State
this in a probability question (similar to g and h), draw the
picture, and find the probability.


Solution: 


d. f(x) = 1/8 where 0 ≤ x ≤ 8


Exercise: 


Problem: 


The age of a first grader on September 1 at Garden Elementary School 
is uniformly distributed from 5.8 to 6.8 years. We randomly select one 
first grader from the class. 


a. Define the random variable. X =
b. X ~
c. Graph the probability distribution.
d. f(x) =
g. Find the probability that she is over 6.5 years.
h. Find the probability that she is between 4 and 6 years.
i. Find the 70th percentile for the age of first graders on September
1 at Garden Elementary School.


Exercise: 


Problem: Let X ~Exp(0.1) 


a. decay rate =
b. μ =
c. Graph the probability distribution function.
d. On the above graph, shade the area corresponding to P(x < 6)
and find the probability.
e. Sketch a new graph, shade the area corresponding to
P(3 < x < 6) and find the probability.
f. Sketch a new graph, shade the area corresponding to P(x > 7)
and find the probability.
g. Sketch a new graph, shade the area corresponding to the 40th
percentile and find the value.
h. Find the average value of x.


Solution: 


a. 0.1
b. 10
d. 0.4512
e. 0.1920
f. 0.4966
g. 5.11
h. 10


Exercise: 


Problem: 


Suppose that the length of long distance phone calls, measured in 
minutes, is known to have an exponential distribution with the average 
length of a call equal to 8 minutes. 


a. Define the random variable. X =
b. Is X continuous or discrete?
c. X ~
d. μ =
e. σ =
f. Draw a graph of the probability distribution. Label the axes.
g. Find the probability that a phone call lasts less than 9 minutes.
h. Find the probability that a phone call lasts more than 9 minutes.
i. Find the probability that a phone call lasts between 7 and 9
minutes.
j. If 25 phone calls are made one after another, on average, what
would you expect the total to be? Why?


Exercise: 
Problem: 


Suppose that the useful life of a particular car battery, measured in 
months, decays with parameter 0.025. We are interested in the life of 
the battery. 


a. Define the random variable. X =
b. Is X continuous or discrete?
c. X ~
d. On average, how long would you expect 1 car battery to last?
e. On average, how long would you expect 9 car batteries to last, if
they are used one after another?
f. Find the probability that a car battery lasts more than 36 months.
g. 70% of the batteries last at least how long?


Solution: 


c. X ~ Exp(0.025)
d. 40 months
e. 360 months
f. 0.4066
g. 14.27


Exercise: 


Problem: 


The percent of persons (ages 5 and older) in each state who speak a
language at home other than English is approximately exponentially
distributed with a mean of 9.848. Suppose we randomly pick a state.
(Source: Bureau of the Census, U.S. Dept. of Commerce)


a. Define the random variable. X =
b. Is X continuous or discrete?
c. X ~
d. μ =
e. σ =
f. Draw a graph of the probability distribution. Label the axes.
g. Find the probability that the percent is less than 12.
h. Find the probability that the percent is between 8 and 14.
i. The percent of all individuals living in the United States who
speak a language at home other than English is 13.8.

   i. Why is this number different from 9.848%?
   ii. What would make this number higher than 9.848%?


Exercise: 


Problem: 


The time (in years) after reaching age 60 that it takes an individual to 
retire is approximately exponentially distributed with a mean of about 
5 years. Suppose we randomly pick one retired individual. We are 
interested in the time after age 60 to retirement. 


a. Define the random variable. X =
b. Is X continuous or discrete?
c. X ~
d. μ =
e. σ =
f. Draw a graph of the probability distribution. Label the axes.
g. Find the probability that the person retired after age 70.
h. Do more people retire before age 65 or after age 65?
i. In a room of 1000 people over age 80, how many do you expect
will NOT have retired yet?


Solution: 


c. X ~ Exp(1/5)
d. 5
e. 5
g. 0.1353
h. Before
i. 18.3


Exercise: 
Problem: 


The cost of all maintenance for a car during its first year is 
approximately exponentially distributed with a mean of $150. 


a. Define the random variable. X =
b. X ~
c. μ =
d. σ =
e. Draw a graph of the probability distribution. Label the axes.
f. Find the probability that a car required over $300 for
maintenance during its first year.


Try these multiple choice problems 


The next three questions refer to the following information. The average 
lifetime of a certain new cell phone is 3 years. The manufacturer will 
replace any cell phone failing within 2 years of the date of purchase. The 
lifetime of these cell phones is known to follow an exponential distribution. 
Exercise: 


Problem: The decay rate is 


A. 0.3333
B. 0.5000
C. 2.0000
D. 3.0000


Solution: 


A 
Exercise: 


Problem: 


What is the probability that a phone will fail within 2 years of the date 
of purchase? 


A. 0.8647
B. 0.4866
C. 0.2212
D. 0.9997


Solution: 


B 


Exercise: 


Problem: What is the median lifetime of these phones (in years)? 


A. 0.1941
B. 1.3863
C. 2.0794
D. 5.5452


Solution: 


C 


The next three questions refer to the following information. The Sky 
Train from the terminal to the rental car and long term parking center is 
supposed to arrive every 8 minutes. The waiting times for the train are 
known to follow a uniform distribution. 

Exercise: 


Problem: What is the average waiting time (in minutes)? 


A. 0.0000
B. 2.0000
C. 3.0000
D. 4.0000


Solution: 


D 


Exercise: 


Problem: Find the 30th percentile for the waiting times (in minutes). 


A. 2.0000
B. 2.4000
C. 2.750
D. 3.000


Solution: 


B 
Exercise: 


Problem: 


The probability of waiting more than 7 minutes given a person has 
waited more than 4 minutes is? 


A. 0.1250
B. 0.2500
C. 0.5000
D. 0.7500


Solution: 


B 


Review 
This module provides a number of homework/review problems related to 
Continuous Random Variables. 


[link] — [link] refer to the following study: A recent study of mothers of 
junior high school children in Santa Clara County reported that 76% of the 
mothers are employed in paid positions. Of those mothers who are 
employed, 64% work full-time (over 35 hours per week), and 36% work 
part-time. However, out of all of the mothers in the population, 49% work 
full-time. The population under study is made up of mothers of junior high 
school children in Santa Clara County. 


Let E = employed, Let F = full-time employment
Exercise: 


Problem: 
a. Find the percent of all mothers in the population that are NOT
employed.
b. Find the percent of mothers in the population that are employed
part-time.


Solution: 


a. 24%
b. 27%


Exercise: 
Problem: 
The type of employment is considered to be what type of data? 
Solution: 


Qualitative 


Exercise: 


Problem: 


Find the probability that a randomly selected mother works part-time 
given that she is employed. 


Solution: 


0.36 
Exercise: 


Problem: 


Find the probability that a randomly selected person from the 
population will be employed OR work full-time. 


Solution: 


0.7636 
Exercise: 


Problem: 


Based upon the above information, are being employed AND working 
part-time: 


a. mutually exclusive events? Why or why not?
b. independent events? Why or why not?
Solution:


a. No
b. No


[link] - [link] refer to the following: We randomly pick 10 mothers from 
the above population. We are interested in the number of the mothers that 
are employed. Let X =number of mothers that are employed. 


Exercise: 


Problem: State the distribution for X.


Solution: 


B(10,0.76) 


Exercise: 


Problem: Find the probability that at least 6 are employed. 


Solution: 


0.9330 
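The binomial probability above can be reproduced by summing individual terms. This is a sketch assuming X ~ B(10, 0.76); it uses `math.comb` from the Python standard library (available in Python 3.8 and later), and `binomial_at_least` is an illustrative name.

```python
from math import comb


def binomial_at_least(k, n, p):
    """P(X >= k) for X ~ B(n, p), summing the probabilities of k, ..., n."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


prob = binomial_at_least(6, 10, 0.76)
print(round(prob, 3))  # approximately 0.933
```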
Exercise: 


Problem: 


We expect the Statistics Discussion Board to have, on average, 14 
questions posted to it per week. We are interested in the number of 
questions posted to it per day. 


• a Define X.

• b What are the values that the random variable may take on?

• c State the distribution for X.

• d Find the probability that from 10 to 14 (inclusive) questions are
posted to the Listserv on a randomly picked day.


Solution: 


• a X = the number of questions posted to the Statistics Listserv
per day

• b 0, 1, 2, 3, ...

• c X ~ P(2)

• d 0


Exercise: 
Problem: 


A person invests $1000 in stock of a company that hopes to go public 
in 1 year. 


• The probability that the person will lose all his money after 1 year
(i.e. his stock will be worthless) is 35%.

• The probability that the person’s stock will still have a value of
$1000 after 1 year (i.e. no profit and no loss) is 60%.

• The probability that the person’s stock will increase in value by
$10,000 after 1 year (i.e. will be worth $11,000) is 5%.


Find the expected PROFIT after 1 year. 


Solution: 


$150 
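
The expected profit is the probability-weighted sum of the three possible profits. A minimal Python sketch of that arithmetic follows; the variable names are illustrative, and the profits are measured relative to the $1000 invested.

```python
# (profit in dollars, probability) for each outcome after 1 year
outcomes = [
    (-1000, 0.35),  # stock is worthless: lose the $1000 invested
    (0, 0.60),      # stock still worth $1000: no profit, no loss
    (10000, 0.05),  # stock worth $11,000: $10,000 profit
]

# Expected value: sum of profit * probability over all outcomes
expected_profit = sum(profit * prob for profit, prob in outcomes)
total_probability = sum(prob for _, prob in outcomes)

print(f"Expected profit: ${expected_profit:.2f}")
```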

Exercise: 
Problem: 
Rachel’s piano cost $3000. The average cost for a piano is $4000 with
a standard deviation of $2500. Becca’s guitar cost $550. The average
cost for a guitar is $500 with a standard deviation of $200. Matt’s
drums cost $600. The average cost for drums is $700 with a standard
deviation of $100. Whose cost was lowest when compared to his or her
own instrument? Justify your answer.


Solution: 


Matt 
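
One way to justify this answer is to convert each cost to a z-score, so that all three are measured in standard deviations from their own instrument's average. The Python sketch below does that; the helper name `z_score` is an assumption made for illustration.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from mean."""
    return (value - mean) / sd

# z-score of each person's cost relative to his or her own instrument
instrument_z = {
    "Rachel": z_score(3000, 4000, 2500),  # piano
    "Becca": z_score(550, 500, 200),      # guitar
    "Matt": z_score(600, 700, 100),       # drums
}

# The lowest z-score is the lowest cost relative to its own distribution
lowest = min(instrument_z, key=instrument_z.get)
for name, z in instrument_z.items():
    print(f"{name}: z = {z:.2f}")
print("Lowest relative cost:", lowest)
```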
Exercise: 


Problem: 


For each statement below, explain why each is either true or false. 


e a 25% of the data are at most 5. 

e b There is the same amount of data from 4 —5 as there is from 5 — 
ds 

e c There are no data values of 3. 

e d 50% of the data are 4. 


Solution: 


e a False 
e b True 
e c False 
e d False 


[link] — [link] refer to the following: 64 faculty members were asked the 
number of cars they owned (including spouse and children’s cars). The 
results are given in the following graph: 

[Figure: relative frequency histogram of the number of cars owned; horizontal axis: number of cars (0 through 7); vertical axis: relative frequency]


Exercise: 


Problem: Find the approximate number of responses that were “3.” 


Solution: 


16 
Exercise: 


Problem: 


Find the first, second and third quartiles. Use them to construct a box 
plot of the data. 


Solution: 


54 


[link] — [link] refer to the following study done of the Girls soccer team 
“Snow Leopards”: 


Hair Style    Hair Color
              blond    brown    black
ponytail      3        2        5
plain         2        2        1


Suppose that one girl from the Snow Leopards is randomly selected. 
Exercise: 


Problem: 


Find the probability that the girl has black hair GIVEN that she wears 
a ponytail. 


Solution: 


Exercise: 


Problem: 


Find the probability that the girl wears her hair plain OR has brown 
hair. 
Solution: 


7/15


Exercise: 


Problem: 


Find the probability that the girl has blond hair AND that she wears 
her hair plain. 


Solution: 


2/15


Lab: Continuous Distribution 
In this lab exercise, students will compare and contrast empirical data from a random number 
generator with the Uniform Distribution. 


Class Time: 


Names: 


Student Learning Outcomes: 


• The student will compare and contrast empirical data from a random number generator
with the Uniform Distribution.


Collect the Data 


Use a random number generator to generate 50 values between 0 and 1 (inclusive). List them 
below. Round the numbers to 4 decimal places or set the calculator MODE to 4 places. 


1. Complete the table: 


2. Calculate the following: 


a x̄ =
b s =
c 1st quartile =
d 3rd quartile =
e Median =


Organize the Data 


1. Construct a histogram of the empirical data. Make 8 bars. 


Relative Frequency 


2. Construct a histogram of the empirical data. Make 5 bars. 


Relative Frequency 


Describe the Data 


1. Describe the shape of each graph. Use 2 — 3 complete sentences. (Keep it simple. Does 
the graph go straight across, does it have a V shape, does it have a hump in the middle or 
at either end, etc.? One way to help you determine a shape, is to roughly draw a smooth 
curve through the top of the bars.) 

2. Describe how changing the number of bars might change the shape. 


Theoretical Distribution 


1. In words, X = 
2. The theoretical distribution of X is X ~ U(0, 1). Use it for this part. 
3. In theory, based upon the distribution X ~ U(0, 1), complete the following. 


a μ =
b σ =
c 1st quartile =
d 3rd quartile =
e median =


4. Are the empirical values (the data) in the section titled "Collect the Data" close to the 
corresponding theoretical values above? Why or why not? 


Plot the Data 


1. Construct a box plot of the data. Be sure to use a ruler to scale accurately and draw 
straight edges. 

2. Do you notice any potential outliers? If so, which values are they? Either way,
numerically justify your answer. (Recall that any DATA that are less than Q1 - 1.5*IQR or
more than Q3 + 1.5*IQR are potential outliers. IQR means interquartile range.)


Compare the Data 


1. For each part below, use a complete sentence to comment on how the value obtained 
from the data compares to the theoretical value you expected from the distribution in the 
section titled "Theoretical Distribution." 


a minimum value:
b 1st quartile:
c median:
d third quartile:
e maximum value:
f width of IQR:
g overall shape:


2. Based on your comments in the section titled "Collect the Data", how does the box plot 
fit or not fit what you would expect of the distribution in the section titled "Theoretical 
Distribution?" 


Discussion Question 


1. Suppose that the number of values generated was 500, not 50. How would that affect 
what you would expect the empirical data to be and the shape of its graph to look like? 
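
For students working on a computer rather than a calculator, the lab can be sketched in Python as follows. This is only an illustration under stated assumptions: the seed is arbitrary, `random.random()` returns values in [0, 1) rather than including 1, and `statistics.quantiles` uses its default quartile method, which may differ slightly from a calculator's. The theoretical values for U(0, 1) use the standard uniform formulas, mean (a + b)/2 and standard deviation (b - a)/√12.

```python
import random
import statistics
from math import sqrt

def summarize(values):
    """Return the mean, sample standard deviation, and quartiles of the data."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "q1": q1,
        "median": median,
        "q3": q3,
    }

# Theoretical values for X ~ U(0, 1)
theoretical = {
    "mean": 0.5,
    "stdev": 1 / sqrt(12),
    "q1": 0.25,
    "median": 0.5,
    "q3": 0.75,
}

rng = random.Random(1)  # arbitrary fixed seed so the run is repeatable
sample_50 = [round(rng.random(), 4) for _ in range(50)]
sample_500 = [round(rng.random(), 4) for _ in range(500)]

for label, data in (("n = 50", sample_50), ("n = 500", sample_500)):
    summary = summarize(data)
    print(label)
    for key, value in summary.items():
        print(f"  {key}: empirical {value:.4f}, theoretical {theoretical[key]:.4f}")
```

Running it with both sample sizes gives a concrete way to approach the discussion question: compare how far each empirical value sits from its theoretical counterpart at n = 50 and at n = 500.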


The Normal Distribution 


Student Learning Outcomes 
By the end of this chapter, the student should be able to: 


• Recognize the normal probability distribution and apply it
appropriately.

• Recognize the standard normal probability distribution and apply it
appropriately.

• Compare normal probabilities by converting to the standard normal
distribution.


Introduction 


The normal, a continuous distribution, is the most important of all the 
distributions. It is widely used and even more widely abused. Its graph is 
bell-shaped. You see the bell curve in almost all disciplines. Some of these 
include psychology, business, economics, the sciences, nursing, and, of 
course, mathematics. Some of your instructors may use the normal 
distribution to help determine your grade. Most IQ scores are normally 
distributed. Often real estate prices fit a normal distribution. The normal 
distribution is extremely important but it cannot be applied to everything in 
the real world. 


In this chapter, you will study the normal distribution, the standard normal, 
and applications associated with them. 


Optional Collaborative Classroom Activity 


Your instructor will record the heights of both men and women in your 
class, separately. Draw histograms of your data. Then draw a smooth curve 
through each histogram. Is each curve somewhat bell-shaped? Do you think 
that if you had recorded 200 data values for men and 200 for women that 
the curves would look bell-shaped? Calculate the mean for each data set. 
Write the means on the x-axis of the appropriate graph below the peak. 


Shade the approximate area that represents the probability that one 
randomly chosen male is taller than 72 inches. Shade the approximate area 
that represents the probability that one randomly chosen female is shorter 
than 60 inches. If the total area under each curve is one, does either 
probability appear to be more than 0.5? 


The normal distribution has two parameters (two numerical descriptive
measures), the mean (μ) and the standard deviation (σ). If X is a quantity to
be measured that has a normal distribution with mean (μ) and standard
deviation (σ), we designate this by writing


NORMAL: X ~ N(μ, σ)


The probability density function is a rather complicated function. Do not 
memorize it. It is not necessary. 


$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-(x-\mu)^{2}/(2\sigma^{2})}$$


The cumulative distribution function is P(X < x). It is calculated either
by a calculator or a computer or it is looked up in a table. Technology has 
made the tables basically obsolete. For that reason, as well as the fact that 
there are various table formats, we are not including table instructions in 
this chapter. See the NOTE in this chapter in Calculation of Probabilities. 


The curve is symmetrical about a vertical line drawn through the mean, μ.
In theory, the mean is the same as the median since the graph is symmetric
about μ. As the notation indicates, the normal distribution depends only on
the mean and the standard deviation. Since the area under the curve must
equal one, a change in the standard deviation, σ, causes a change in the
shape of the curve; the curve becomes fatter or skinnier depending on σ. A
change in μ causes the graph to shift to the left or right. This means there
are an infinite number of normal probability distributions. One of special
interest is called the standard normal distribution.
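
The effect of each parameter can be seen numerically. The sketch below uses Python's standard-library `statistics.NormalDist` (a tool choice made here, not one the text prescribes) to compare the height of the curve at its peak for three illustrative normal distributions.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0, sigma=1)   # the standard normal
wide = NormalDist(mu=0, sigma=2)     # same center, larger standard deviation
shifted = NormalDist(mu=3, sigma=1)  # same spread, center moved to the right

# The curve is highest at its mean. Because the total area must stay 1,
# doubling sigma halves the peak height (a fatter, lower curve).
print("peak, sigma = 1:", round(narrow.pdf(0), 4))
print("peak, sigma = 2:", round(wide.pdf(0), 4))

# Changing mu only slides the curve: the peak keeps the same height.
print("peak, mu = 3:   ", round(shifted.pdf(3), 4))
```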


Glossary 


Normal Distribution 


A continuous random variable (RV) with pdf
$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-(x-\mu)^{2}/(2\sigma^{2})}$, where μ is the mean of the distribution and
σ is the standard deviation. Notation: X ~ N(μ, σ). If μ = 0 and
σ = 1, the RV is called the standard normal distribution.


The Standard Normal Distribution 


The standard normal distribution is a normal distribution of 
standardized values called z-scores. A z-score is measured in units of 
the standard deviation. For example, if the mean of a normal distribution 
is 5 and the standard deviation is 2, the value 11 is 3 standard deviations 
above (or to the right of) the mean. The calculation is: 

Equation:

$$z = \frac{x - \mu}{\sigma} = \frac{11 - 5}{2} = 3$$

The z-score is 3. 


The mean for the standard normal distribution is 0 and the standard
deviation is 1. The transformation $z = \frac{x - \mu}{\sigma}$ produces the
distribution Z ~ N(0, 1). The value x comes from a normal distribution
with mean μ and standard deviation σ.


Glossary 


Standard Normal Distribution 
A continuous random variable (RV) X ~ N(0, 1). When X follows the
standard normal distribution, it is often noted as Z ~ N(0, 1).


z-score
The linear transformation of the form $z = \frac{x - \mu}{\sigma}$. If this transformation
is applied to any normal distribution X ~ N(μ, σ), the result is the
standard normal distribution Z ~ N(0, 1). If this transformation is
applied to any specific value x of the RV with mean μ and standard
deviation σ, the result is called the z-score of x. Z-scores allow us to
compare data that are normally distributed but scaled differently.


Z-Scores 


If X is a normally distributed random variable and X ~ N(μ, σ), then the
z-score is:
Equation:

$$z = \frac{x - \mu}{\sigma}$$

The z-score tells you how many standard deviations the value x is
above (to the right of) or below (to the left of) the mean, μ. Values of x
that are larger than the mean have positive z-scores and values of x that are
smaller than the mean have negative z-scores. If x equals the mean, then x
has a z-score of 0.


Example: 

Suppose X ~ N(5, 6). This says that X is a normally distributed random
variable with mean μ = 5 and standard deviation σ = 6. Suppose x = 17.
Then:

Equation:

$$z = \frac{x - \mu}{\sigma} = \frac{17 - 5}{6} = 2$$

This means that x = 17 is 2 standard deviations (2σ) above or to the
right of the mean μ = 5. The standard deviation is σ = 6.

Notice that:

Equation:

$$5 + 2 \cdot 6 = 17 \quad (\text{The pattern is } \mu + z\sigma = x.)$$

Now suppose x = 1. Then:
Equation:

$$z = \frac{x - \mu}{\sigma} = \frac{1 - 5}{6} = -0.67 \quad (\text{rounded to two decimal places})$$

This means that x = 1 is 0.67 standard deviations (-0.67σ) below or to
the left of the mean μ = 5. Notice that:

5 + (-0.67)(6) is approximately equal to 1 (This has the pattern
μ + (-0.67)σ = 1)

Summarizing, when z is positive, x is above or to the right of μ and when
z is negative, x is to the left of or below μ.
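
The two calculations in this example, and the pattern μ + zσ = x that undoes them, can be written as a pair of small functions. This is a Python sketch with illustrative function names, not part of the original text.

```python
def z_score(x, mu, sigma):
    """How many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

def x_from_z(z, mu, sigma):
    """Recover the original value from a z-score: x = mu + z * sigma."""
    return mu + z * sigma

mu, sigma = 5, 6  # X ~ N(5, 6)

z_17 = z_score(17, mu, sigma)
z_1 = z_score(1, mu, sigma)

print("z for x = 17:", z_17)
print("z for x = 1: ", round(z_1, 2))
print("x for z = 2: ", x_from_z(2, mu, sigma))
```

Because the z-score for x = 1 is rounded to -0.67 in the text, 5 + (-0.67)(6) comes out only approximately equal to 1; using the unrounded z-score recovers 1 up to floating-point error.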


Example: 

Some doctors believe that a person can lose 5 pounds, on the average, in a 
month by reducing his/her fat intake and by exercising consistently. 
Suppose weight loss has a normal distribution. Let X = the amount of 
weight lost (in pounds) by a person in a month. Use a standard deviation of 
2 pounds. X~N(5, 2). Fill in the blanks. 

Exercise: 


Problem: 


Suppose a person lost 10 pounds in a month. The z-score when
x = 10 pounds is z = 2.5 (verify). This z-score tells you that x = 10
is ________ standard deviations to the ________ (right or left) of the
mean ________ (What is the mean?).
Solution: 


This z-score tells you that x = 10 is 2.5 standard deviations to the 
right of the mean 5. 


Exercise: 
Problem: 
Suppose a person gained 3 pounds (a negative weight loss). Then z =
________. This z-score tells you that x = -3 is ________ standard
deviations to the ________ (right or left) of the mean.


Solution: 


z = -4. This z-score tells you that x = -3 is 4 standard deviations to
the left of the mean.

Suppose the random variables X and Y have the following normal
distributions: X ~ N(5, 6) and Y ~ N(2, 1). If x = 17, then z = 2. (This
was previously shown.) If y = 4, what is z?
Equation:

$$z = \frac{y - \mu}{\sigma} = \frac{4 - 2}{1} = 2, \quad \text{where } \mu = 2 \text{ and } \sigma = 1.$$

The z-score for y = 4 is z = 2. This means that 4 is z = 2 standard
deviations to the right of the mean. Therefore, x = 17 and y = 4 are both 2
(of their) standard deviations to the right of their respective means.

The z-score allows us to compare data that are scaled differently. To 
understand the concept, suppose X ~N(5, 6) represents weight gains for 
one group of people who are trying to gain weight in a 6 week period and 
Y ~N(2, 1) measures the same weight gain for a second group of people. 
A negative weight gain would be a weight loss. Since x = 17 and y = 4 
are each 2 standard deviations to the right of their means, they represent 
the same weight gain relative to their means. 


The Empirical Rule 
If X is a random variable and has a normal distribution with mean μ and
standard deviation σ, then the Empirical Rule says (see the figure below):


• About 68.27% of the x values lie between -1σ and +1σ of the mean μ
(within 1 standard deviation of the mean).

• About 95.45% of the x values lie between -2σ and +2σ of the mean μ
(within 2 standard deviations of the mean).

• About 99.73% of the x values lie between -3σ and +3σ of the mean μ
(within 3 standard deviations of the mean). Notice that almost all the x
values lie within 3 standard deviations of the mean.

• The z-scores for +1σ and -1σ are +1 and -1, respectively.

• The z-scores for +2σ and -2σ are +2 and -2, respectively.

• The z-scores for +3σ and -3σ are +3 and -3, respectively.


[Figure: normal curve with the horizontal axis marked at μ - 3σ, μ - 2σ, μ - 1σ, μ, μ + 1σ, μ + 2σ, and μ + 3σ]


The Empirical Rule is also known as the 68-95-99.7 Rule. 


Example: 


Suppose X has a normal distribution with mean 50 and standard deviation 
6. 


• About 68.27% of the x values lie between -1σ = (-1)(6) = -6 and 1σ =
(1)(6) = 6 of the mean 50. The values 50 - 6 = 44 and 50 + 6 = 56 are
within 1 standard deviation of the mean 50. The z-scores are -1 and
+1 for 44 and 56, respectively.

• About 95.45% of the x values lie between -2σ = (-2)(6) = -12 and 2σ
= (2)(6) = 12 of the mean 50. The values 50 - 12 = 38 and 50 + 12 =
62 are within 2 standard deviations of the mean 50. The z-scores are
-2 and +2 for 38 and 62, respectively.

• About 99.73% of the x values lie between -3σ = (-3)(6) = -18 and 3σ
= (3)(6) = 18 of the mean 50. The values 50 - 18 = 32 and 50 + 18 =
68 are within 3 standard deviations of the mean 50. The z-scores are
-3 and +3 for 32 and 68, respectively.
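
The three percentages in this example can be reproduced from the cumulative distribution function. The following Python sketch uses the standard-library `statistics.NormalDist`; the helper name `fraction_within` is an assumption made for illustration.

```python
from statistics import NormalDist

X = NormalDist(mu=50, sigma=6)

def fraction_within(dist, k):
    """P(mu - k*sigma < X < mu + k*sigma) for a normal distribution."""
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    return dist.cdf(high) - dist.cdf(low)

for k in (1, 2, 3):
    low, high = 50 - k * 6, 50 + k * 6
    percent = 100 * fraction_within(X, k)
    print(f"between {low} and {high}: about {percent:.2f}%")
```

The same percentages come out for any mean and standard deviation, since only the number of standard deviations k matters.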


Normal Distribution: Areas to the Left and Right of x 


The arrow in the graph below points to the area to the left of x. This area is
represented by the probability P(X < x). Normal tables, computers, and
calculators provide or calculate the probability P(X < x).


[Figure: normal curve with the area to the left of x shaded and labeled P(X < x)]


The area to the right is then P(X > x) = 1 - P(X < x).
Remember, P(X < x) = Area to the left of the vertical line through x.


P(X > x) = 1 - P(X < x) = Area to the right of the vertical line
through x.


P(X < x) is the same as P(X ≤ x) and P(X > x) is the same as
P(X ≥ x) for continuous distributions.


Calculations of Probabilities 


Probabilities are calculated by using technology. There are instructions in 
the chapter for the TI-83+ and TI-84 calculators. 


Note: In the Table of Contents for Collaborative Statistics, entry 15.
Tables has a link to a table of normal probabilities. Use the probability
tables if so desired, instead of a calculator. The tables include instructions
for how to use them.


Example: 
If the area to the left is 0.0228, then the area to the right is
1 - 0.0228 = 0.9772.


Example: 

The final exam scores in a statistics class were normally distributed with a 
mean of 63 and a standard deviation of 5. 

Exercise: 


Problem: 


Find the probability that a randomly selected student scored more than 
65 on the exam. 


Solution: 


Let X = a score on the final exam. X ~ N(63, 5), where μ = 63 and
σ = 5.


Draw a graph.


Then, find P(x > 65).


P(x > 65) = 0.3446 (calculator or computer) 


The probability that one student scores more than 65 is 0.3446.


[Figure: normal curve centered at 63 with the area to the right of 65 shaded and labeled 0.3446]


Using the TI-83+ or the TI-84 calculators, the calculation is as 
follows. Go into 2nd DISTR. 


After pressing 2nd DISTR, press 2:normalcdf. 
The syntax for the instructions is shown below.


normalcdf(lower value, upper value, mean, standard deviation) For
this problem: normalcdf(65,1E99,63,5) = 0.3446. You get 1E99 ( =
10^99) by pressing 1, the EE key (a 2nd key) and then 99. Or, you can
enter 10^99 instead. The number 10^99 is way out in the right tail of
the normal curve. We are calculating the area between 65 and 10^99. In
some instances, the lower number of the area might be -1E99 ( =
-10^99). The number -10^99 is way out in the left tail of the normal
curve.


Note: The TI probability program calculates a z-score and then the
probability from the z-score. Before technology, the z-score was
looked up in a standard normal probability table (because the math
involved is too cumbersome) to find the probability. In this example,
a standard normal table with area to the left of the z-score was used.
You calculate the z-score and look up the area to the left. The
probability is the area to the right.


$z = \frac{65 - 63}{5} = 0.4$. Area to the left is 0.6554.
P(x > 65) = P(z > 0.4) = 1 - 0.6554 = 0.3446
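
For readers without a TI calculator, the same probability can be found with Python's standard-library `statistics.NormalDist`. This is offered as an alternative sketch, not as the method the text prescribes; `cdf` returns the area to the left, so the area to the right is 1 minus that value.

```python
from statistics import NormalDist

exam = NormalDist(mu=63, sigma=5)  # X ~ N(63, 5)

z = (65 - exam.mean) / exam.stdev  # z-score for x = 65
area_left = exam.cdf(65)           # P(x < 65)
area_right = 1 - area_left         # P(x > 65)

print("z-score:", z)
print("area to the left: ", round(area_left, 4))
print("area to the right:", round(area_right, 4))
```

Unlike normalcdf, no artificial upper bound such as 1E99 is needed, because the right-tail area is obtained by subtraction.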


Exercise: 


Problem: 


Find the probability that a randomly selected student scored less than 
85. 


Solution: 


Draw a graph. 


Then find P(x < 85). Shade the graph. P(x < 85) = 1 (calculator
or computer) 


The probability that one student scores less than 85 is approximately 1 
(or 100%). 


The TI-instructions and answer are as follows: 


normalcdf(0,85,63,5) = 1 (rounds to 1) 
Exercise: 


Problem: 


Find the 90th percentile (that is, find the score k that has 90% of the
scores below k and 10% of the scores above k).


Solution: 


Find the 90th percentile. For each problem or part of a problem, draw 
a new graph. Draw the x-axis. Shade the area that corresponds to the 
90th percentile. 


Let k = the 90th percentile. k is located on the x-axis. P(x < k) is
the area to the left of k. The 90th percentile k separates the exam
scores into those that are the same or lower than k and those that are
the same or higher. Ninety percent of the test scores are the same or
lower than k and 10% are the same or higher. k is often called a
critical value.


k = 69.4 (calculator or computer) 


P(x < k) = 0.90 


The 90th percentile is 69.4. This means that 90% of the test scores fall
at or below 69.4 and 10% fall at or above. For the TI-83+ or TI-84
calculators, use invNorm in 2nd DISTR. invNorm(area to the left,
mean, standard deviation) For this problem, invNorm(0.90,63,5) =
69.4


Exercise: 


Problem: 


Find the 70th percentile (that is, find the score k such that 70% of 
scores are below k and 30% of the scores are above k). 


Solution: 


Find the 70th percentile. 
Draw a new graph and label it appropriately. k = 65.6 


The 70th percentile is 65.6. This means that 70% of the test scores fall
at or below 65.6 and 30% fall at or above.


invNorm(0.70,63,5) = 65.6 
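
Both percentile problems ask for the inverse of the cumulative distribution function: given an area to the left, find the score. In Python's standard library this is `NormalDist.inv_cdf`, which plays the role invNorm plays on the calculator. The sketch below is an illustration, with variable names chosen here.

```python
from statistics import NormalDist

exam = NormalDist(mu=63, sigma=5)  # X ~ N(63, 5)

# inv_cdf(p) returns the score k with P(x < k) = p
k_90 = exam.inv_cdf(0.90)
k_70 = exam.inv_cdf(0.70)

print("90th percentile:", round(k_90, 1))
print("70th percentile:", round(k_70, 1))

# Check: the area to the left of each k matches the requested percentile
print(round(exam.cdf(k_90), 2), round(exam.cdf(k_70), 2))
```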


Example: 

A computer is used for office work at home, research, communication, 
personal finances, education, entertainment, social networking and a 
myriad of other things. Suppose that the average number of hours a 
household personal computer is used for entertainment is 2 hours per day. 
Assume the times for entertainment are normally distributed and the 
standard deviation for the times is half an hour. 

Exercise: 


Problem: 


Find the probability that a household personal computer is used 
between 1.8 and 2.75 hours per day. 


Solution: 


Let X = the amount of time (in hours) a household personal computer
is used for entertainment. X ~ N(2, 0.5) where μ = 2 and σ = 0.5.


Find P(1.8 < x < 2.75).


The probability for which you are looking is the area between
x = 1.8 and x = 2.75. P(1.8 < x < 2.75) = 0.5886


[Figure: normal curve with the area between 1.8 and 2.75 shaded]


normalcdf(1.8,2.75,2,0.5) = 0.5886 
The probability that a household personal computer is used between 
1.8 and 2.75 hours per day for entertainment is 0.5886. 
Exercise: 
Problem: 


Find the maximum number of hours per day that the bottom quartile 
of households use a personal computer for entertainment. 


Solution: 


To find the maximum number of hours per day that the bottom 
quartile of households uses a personal computer for entertainment, 
find the 25th percentile, k, where P(x < k) = 0.25. 


k = 1.66


[Figure: normal curve with the area P(x < k) = 0.25 shaded to the left of k and P(x > k) = 0.75 to the right]


invNorm(0.25,2,.5) = 1.66 


The maximum number of hours per day that the bottom quartile of 
households uses a personal computer for entertainment is 1.66 hours. 
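
Both parts of this example can be checked in a few lines. The sketch below again assumes Python's standard-library `statistics.NormalDist` as a stand-in for the calculator: a probability between two values is a difference of two left-hand areas, and the bottom quartile is the 25th percentile.

```python
from statistics import NormalDist

hours = NormalDist(mu=2, sigma=0.5)  # X ~ N(2, 0.5)

# P(1.8 < x < 2.75) = P(x < 2.75) - P(x < 1.8)
p_between = hours.cdf(2.75) - hours.cdf(1.8)

# Bottom quartile: find k with P(x < k) = 0.25
k_25 = hours.inv_cdf(0.25)

print("P(1.8 < x < 2.75):", round(p_between, 4))
print("25th percentile:  ", round(k_25, 2), "hours")
```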


Summary of Formulas
Formula
Normal Probability Distribution


X ~ N(μ, σ)


μ = the mean; σ = the standard deviation
Formula
Standard Normal Probability Distribution


Z ~ N(0, 1)
z = a standardized value (z-score)


mean = 0; standard deviation = 1
Formula
Finding the kth Percentile


To find the kth percentile when the z-score is known: k = μ + (z)σ
Formula
z-score


$$z = \frac{x - \mu}{\sigma}$$
Formula
Finding the area to the left


The area to the left: P(X < x)
Formula
Finding the area to the right


The area to the right: P(X > x) = 1 - P(X < x)


Practice: The Normal Distribution 


Student Learning Outcomes 


e The student will analyze data following a normal distribution. 


Given 


The life of Sunshine CD players is normally distributed with a mean of 4.1 
years and a standard deviation of 1.3 years. A CD player is guaranteed for 3 
years. We are interested in the length of time a CD player lasts. 


Normal Distribution 
Exercise: 


Problem: Define the Random Variable X in words. X = 


Exercise: 


Problem: X~ 
Exercise: 


Problem: 


Find the probability that a CD player will break down during the 
guarantee period. 


• a Sketch the situation. Label and scale the axes. Shade the region
corresponding to the probability.


• b P(0 < x < ________) = ________ (Use zero (0) for the
minimum value of x.)


Solution: 
• b 3, 0.1979
Exercise: 
Problem: 
Find the probability that a CD player will last between 2.8 and 6 years. 


• a Sketch the situation. Label and scale the axes. Shade the region
corresponding to the probability.


• b P(________ < x < ________) = ________


Solution: 
• b 2.8, 6, 0.7694
Exercise: 
Problem: 


Find the 70th percentile of the distribution for the time a CD player 
lasts. 


• a Sketch the situation. Label and scale the axes. Shade the region
corresponding to the lower 70%.


• b P(x < k) = ________. Therefore, k = ________


Solution: 


• b 0.70, 4.78 years


Homework 
Exercise: 


Problem: 


According to a study done by De Anza students, the height for Asian 
adult males is normally distributed with an average of 66 inches and a 
standard deviation of 2.5 inches. Suppose one Asian adult male is 
randomly chosen. Let X = height of the individual.


• a X ~ ________ (________, ________)

• b Find the probability that the person is between 65 and 69
inches. Include a sketch of the graph and write a probability
statement.

• c Would you expect to meet many Asian adult males over 72
inches? Explain why or why not, and justify your answer
numerically.

• d The middle 40% of heights fall between what two values?
Sketch the graph and write the probability statement.


Solution: 


• a N(66, 2.5)

• b 0.5404

• c No

• d Between 64.7 and 67.3 inches
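
Part (d) uses a step the chapter examples do not spell out: a "middle 40%" leaves 30% in each tail, so the two values are the 30th and 70th percentiles. The Python sketch below illustrates this; the helper name `middle_interval` is hypothetical, and `statistics.NormalDist` stands in for the calculator.

```python
from statistics import NormalDist

heights = NormalDist(mu=66, sigma=2.5)  # X ~ N(66, 2.5)

def middle_interval(dist, fraction):
    """Return (low, high) so that the central `fraction` of values lies between them."""
    tail = (1 - fraction) / 2  # area left over in each tail
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)

# Part (b): P(65 < x < 69) as a difference of two left-hand areas
p_65_to_69 = heights.cdf(69) - heights.cdf(65)

# Part (d): the middle 40% runs from the 30th to the 70th percentile
low, high = middle_interval(heights, 0.40)

print("P(65 < x < 69):", round(p_65_to_69, 4))
print("middle 40%: between", round(low, 1), "and", round(high, 1), "inches")
```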


Exercise: 


Problem: 


IQ is normally distributed with a mean of 100 and a standard deviation 
of 15. Suppose one individual is randomly chosen. Let X = IQ of an
individual.


• a X ~ ________ (________, ________)
• b Find the probability that the person has an IQ greater than 120.
Include a sketch of the graph and write a probability statement.

• c Mensa is an organization whose members have the top 2% of all
IQs. Find the minimum IQ needed to qualify for the Mensa
organization. Sketch the graph and write the probability
statement.

• d The middle 50% of IQs fall between what two values? Sketch
the graph and write the probability statement.


Exercise: 


Problem: 


The percent of fat calories that a person in America consumes each 
day is normally distributed with a mean of about 36 and a standard 
deviation of 10. Suppose that one individual is randomly chosen. Let 
X = percent of fat calories.


• a X ~ ________ (________, ________)

• b Find the probability that the percent of fat calories a person
consumes is more than 40. Graph the situation. Shade in the area
to be determined.

• c Find the maximum number for the lower quarter of percent of
fat calories. Sketch the graph and write the probability statement.


Solution: 


• a N(36, 10)
• b 0.3446
• c 29.3


Exercise: 
Problem: 
Suppose that the distance of fly balls hit to the outfield (in baseball) is
normally distributed with a mean of 250 feet and a standard deviation
of 50 feet.


• a If X = distance in feet for a fly ball, then X ~ ________
(________, ________)

• b If one fly ball is randomly chosen from this distribution, what is
the probability that this ball traveled fewer than 220 feet? Sketch
the graph. Scale the horizontal axis X. Shade the region
corresponding to the probability. Find the probability.

• c Find the 80th percentile of the distribution of fly balls. Sketch
the graph and write the probability statement.


Exercise: 


Problem: 


In China, 4-year-olds average 3 hours a day unsupervised. Most of the 
unsupervised children live in rural areas, considered safe. Suppose that 
the standard deviation is 1.5 hours and the amount of time spent alone 
is normally distributed. We randomly survey one Chinese 4-year-old 
living in a rural area. We are interested in the amount of time the child 
spends alone per day. (Source: San Jose Mercury News) 


• a In words, define the random variable X. X = ________

• b X ~ ________

• c Find the probability that the child spends less than 1 hour per
day unsupervised. Sketch the graph and write the probability
statement.

• d What percent of the children spend over 10 hours per day
unsupervised?

• e 70% of the children spend at least how long per day
unsupervised?


Solution: 


• a the time (in hours) a 4-year-old in China spends unsupervised
per day

• b N(3, 1.5)

• c 0.0912

• d 0

• e 2.21 hours


Exercise: 


Problem: 


In the 1992 presidential election, Alaska’s 40 election districts 
averaged 1956.8 votes per district for President Clinton. The standard 
deviation was 572.3. (There are only 40 election districts in Alaska.) 
The distribution of the votes per district for President Clinton was bell- 
shaped. Let X = number of votes for President Clinton for an election 
district. (Source: The World Almanac and Book of Facts) 


a State the approximate distribution of X. X~ 

b Is 1956.8 a population mean or a sample mean? How do you 
know? 

c Find the probability that a randomly selected district had fewer 
than 1600 votes for President Clinton. Sketch the graph and write 
the probability statement. 

d Find the probability that a randomly selected district had 
between 1800 and 2000 votes for President Clinton. 

e Find the third quartile for votes for President Clinton. 


Exercise: 


Problem: 


Suppose that the duration of a particular type of criminal trial is known 
to be normally distributed with a mean of 21 days and a standard 
deviation of 7 days. 


a In words, define the random variable X. X = 

b X~ 

c If one of the trials is randomly chosen, find the probability that 
it lasted at least 24 days. Sketch the graph and write the 
probability statement. 

d 60% of all of these types of trials are completed within how 
many days? 


Solution: 


• a The duration of a criminal trial
• b N(21, 7)

• c 0.3341

• d 22.77


Exercise: 


Problem: 


Terri Vogel, an amateur motorcycle racer, averages 129.71 seconds per
2.5 mile lap (in a 7 lap race) with a standard deviation of 2.28 seconds.
The distribution of her race times is normally distributed. We are
interested in one of her randomly selected laps. (Source: log book of
Terri Vogel)


• a In words, define the random variable X. X = ________
• b X ~ ________
• c Find the percent of her laps that are completed in less than 130
seconds.
• d The fastest 3% of her laps are under ________.
• e The middle 80% of her laps are from ________ seconds to
________ seconds.


Exercise: 


Problem: 


Thuy Dau, Ngoc Bui, Sam Su, and Lan Voung conducted a survey as 
to how long customers at Lucky claimed to wait in the checkout line 
until their turn. Let X = time in line. Below are the ordered real data
(in minutes):


0.50    4.25    5       6       7.25
1.75    4.25    5.25    6       7.25
2       4.25    5.25    6.25    7.25
2.25    4.25    5.5     6.25    7.75
2.25    4.5     5.5     6.5     8
2.5     4.75    5.5     6.5     8.25
2.75    4.75    5.75    6.5     9.5
3.25    4.75    5.75    6.75    9.5
3.75    5       6       6.75    9.75
3.75    5       6       6.75    10.75


• a Calculate the sample mean and the sample standard deviation.
• b Construct a histogram. Start the x-axis at -0.375 and make
bar widths of 2 minutes.

• c Draw a smooth curve through the midpoints of the tops of the
bars.

• d In words, describe the shape of your histogram and smooth
curve.

• e Let the sample mean approximate μ and the sample standard
deviation approximate σ. The distribution of X can then be
approximated by X ~ ________

• f Use the distribution in (e) to calculate the probability that a
person will wait fewer than 6.1 minutes.

• g Determine the cumulative relative frequency for waiting less
than 6.1 minutes.

• h Why aren’t the answers to (f) and (g) exactly the same?

• i Why are the answers to (f) and (g) as close as they are?

• j If only 10 customers were surveyed instead of 50, do you think
the answers to (f) and (g) would have been closer together or
farther apart? Explain your conclusion.


Solution: 


• a The sample mean is 5.51 and the sample standard deviation is
2.15

• e N(5.51, 2.15)

• f 0.6081

• g 0.64
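
Parts (a), (f), and (g) can be reproduced with a short Python sketch. Two assumptions are worth stating plainly: the 50 values listed in the code are a transcription of the data table as best it can be read, and, following the solution to (e), the normal model uses the mean and standard deviation rounded to two decimal places.

```python
from statistics import NormalDist, mean, stdev

# Waiting times in minutes (assumed transcription of the 50 data values)
wait_times = [
    0.50, 1.75, 2, 2.25, 2.25, 2.5, 2.75, 3.25, 3.75, 3.75,
    4.25, 4.25, 4.25, 4.25, 4.5, 4.75, 4.75, 4.75, 5, 5,
    5, 5.25, 5.25, 5.5, 5.5, 5.5, 5.75, 5.75, 6, 6,
    6, 6, 6.25, 6.25, 6.5, 6.5, 6.5, 6.75, 6.75, 6.75,
    7.25, 7.25, 7.25, 7.75, 8, 8.25, 9.5, 9.5, 9.75, 10.75,
]

# (a) sample mean and sample standard deviation, rounded as in the solution
sample_mean = round(mean(wait_times), 2)
sample_sd = round(stdev(wait_times), 2)

# (f) probability of waiting fewer than 6.1 minutes under the normal model
model = NormalDist(mu=sample_mean, sigma=sample_sd)
p_model = model.cdf(6.1)

# (g) cumulative relative frequency: fraction of observed waits below 6.1
cum_rel_freq = sum(t < 6.1 for t in wait_times) / len(wait_times)

print("mean:", sample_mean, " sd:", sample_sd)
print("model P(x < 6.1):", round(p_model, 4))
print("cumulative relative frequency:", cum_rel_freq)
```

The two results are close but not equal: one is an area under a smooth curve fitted to the data, the other a count from the data themselves, which is the contrast parts (h) and (i) ask about.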


Exercise: 


Problem: 


Suppose that Ricardo and Anita attend different colleges. Ricardo’s 
GPA is the same as the average GPA at his school. Anita’s GPA is 0.70 
standard deviations above her school average. In complete sentences, 
explain why each of the following statements may be false. 


e a Ricardo’s actual GPA is lower than Anita’s actual GPA. 
e b Ricardo is not passing since his z-score is zero. 
e c Anita is in the 70th percentile of students at her college. 


Exercise: 


Problem: 


Below is a sample of the maximum capacity (maximum number of 
spectators) of sports stadiums. The table does not include horse racing 
or motor racing stadiums. (Source: 
http://en.wikipedia.org/wiki/List_of_stadiums_by_capacity) 


40,000    40,000    45,050    45,500    46,249    48,134
49,133    50,071    50,096    50,466    50,832    51,100
51,500    51,900    52,000    52,132    52,200    52,530
52,692    53,864    54,000    55,000    55,000    55,000
55,000    55,000    55,000    55,082    57,000    58,008
59,680    60,000    60,000    60,492    60,580    62,380
62,872    64,035    65,000    65,050    65,647    66,000
66,161    67,428    68,349    68,976    69,372    70,107
70,585    71,594    72,000    72,922    73,379    74,500
75,025    76,212    78,000    80,000    80,000    82,300


• a Calculate the sample mean and the sample standard deviation
for the maximum capacity of sports stadiums (the data).
• b Construct a histogram of the data.
• c Draw a smooth curve through the midpoints of the tops of the
bars of the histogram.
• d In words, describe the shape of your histogram and smooth
curve.
• e Let the sample mean approximate μ and the sample standard
deviation approximate σ. The distribution of X can then be
approximated by X ~ ________
• f Use the distribution in (e) to calculate the probability that the
maximum capacity of sports stadiums is less than 67,000
spectators.
• g Determine the cumulative relative frequency that the maximum
capacity of sports stadiums is less than 67,000 spectators. Hint:
Order the data and count the sports stadiums that have a
maximum capacity less than 67,000. Divide by the total number
of sports stadiums in the sample.
• h Why aren’t the answers to (f) and (g) exactly the same?


Solution: 


• a The sample mean is 60,136.4 and the sample standard deviation
is 10,468.1.

• e N(60136.4, 10468.1)

• f 0.7440

• g 0.7167
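
The answer to part (f) can be checked numerically. The following is a minimal sketch, assuming the mean and standard deviation quoted in the solution to part (a) and using Python's standard-library NormalDist class; the variable names are illustrative only.

```python
from statistics import NormalDist

# Values quoted in the solution to part (a).
mean_capacity = 60136.4
sd_capacity = 10468.1

# Part (f): P(X < 67,000) under the normal model from part (e).
model = NormalDist(mu=mean_capacity, sigma=sd_capacity)
p_model = model.cdf(67000)

# Part (g): cumulative relative frequency quoted in the solution.
p_empirical = 0.7167

print(f"normal model:   {p_model:.4f}")
print(f"empirical data: {p_empirical:.4f}")
print(f"difference:     {p_model - p_empirical:.4f}")
```

The two values differ slightly because the normal curve is only an approximation to the 60 observed capacities.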


Try These Multiple Choice Questions 


The questions below refer to the following: The patient recovery time 
from a particular surgical procedure is normally distributed with a mean of 
5.3 days and a standard deviation of 2.1 days. 

Exercise: 


Problem: What is the median recovery time? 


• A 2.7
• B 5.3
• C 7.4
• D 2.1


Solution: 


B 
Exercise: 


Problem: 


What is the z-score for a patient who takes 10 days to recover? 


• A 1.5
• B 0.2
• C 2.2
• D 7.3


Solution: 


C 
Exercise: 


Problem: 
What is the probability of spending more than 2 days in recovery? 


• A 0.0580
• B 0.8447
• C 0.0553
• D 0.9420


Solution: 


D 


Exercise: 


Problem: The 90th percentile for recovery times is? 


• A 8.89
• B 7.07
• C 7.99
• D 4.32


Solution:


C
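
The four recovery-time answers above can be reproduced in a few lines. This is a sketch only, assuming the stated N(5.3, 2.1) model and Python's standard-library NormalDist; the variable names are not from the text.

```python
from statistics import NormalDist

recovery = NormalDist(mu=5.3, sigma=2.1)  # recovery time in days

# For a normal distribution the median equals the mean.
median_days = recovery.inv_cdf(0.50)

# z-score for a patient who takes 10 days to recover.
z_ten_days = (10 - recovery.mean) / recovery.stdev

# P(X > 2): complement of the cumulative probability at 2 days.
p_more_than_two = 1 - recovery.cdf(2)

# 90th percentile: the value with 90% of recovery times below it.
pct_90 = recovery.inv_cdf(0.90)

print(round(median_days, 1), round(z_ten_days, 1),
      round(p_more_than_two, 4), round(pct_90, 2))
```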


The questions below refer to the following: The length of time to find a 
parking space at 9 A.M. follows a normal distribution with a mean of 5 
minutes and a standard deviation of 2 minutes. 

Exercise: 


Problem: 


Based upon the above information and numerically justified, would 
you be surprised if it took less than 1 minute to find a parking space? 


• A Yes
• B No
• C Unable to determine


Solution: 


A 
Exercise: 


Problem: 


Find the probability that it takes at least 8 minutes to find a parking 
space. 


• A 0.0001
• B 0.9270
• C 0.1862
• D 0.0668


Solution: 


D 


Exercise: 


Problem: 


Seventy percent of the time, it takes more than how many minutes to 
find a parking space? 


• A 1.24
• B 2.41
• C 3.95
• D 6.05


Solution:


C
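
For the parking questions, "seventy percent of the time it takes more than k minutes" means k is the 30th percentile. A short sketch, assuming the stated N(5, 2) model and Python's standard-library NormalDist:

```python
from statistics import NormalDist

parking = NormalDist(mu=5, sigma=2)  # minutes to find a space at 9 A.M.

# Less than 1 minute is two standard deviations below the mean.
p_under_one = parking.cdf(1)

# P(X >= 8) is the area to the right of 8 minutes.
p_at_least_eight = 1 - parking.cdf(8)

# 70% of times exceed k, so 30% fall below k.
k_minutes = parking.inv_cdf(0.30)

print(round(p_under_one, 4), round(p_at_least_eight, 4), round(k_minutes, 2))
```

A probability of about 2% for finding a space in under 1 minute is what justifies answering "Yes" to the first question.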
Exercise: 
Problem: 


If the mean is significantly greater than the standard deviation, which 
of the following statements is true? 


• I The data cannot follow the uniform distribution.
• II The data cannot follow the exponential distribution.
• III The data cannot follow the normal distribution.


• A I only

• B II only

• C III only

• D I, II, and III


Solution: 


B 


Review 


The next two questions refer to: X ~ U(3, 13) 
Exercise: 


Problem: Explain which of the following are false and which are true. 


• a f(x) = 1/10, 3 < x < 13
• b There is no mode.
• c The median is less than the mean.
• d P(x > 10) = P(x < 6)


Solution: 


• a True

• b True

• c False - the median and the mean are the same for this
symmetric distribution

• d True


Exercise: 


Problem: Calculate: 


• a Mean
• b Median
• c 65th percentile.


Solution: 


• a 8
• b 8
• c P(x < k) = 0.65 = (k - 3)(1/10). k = 9.5
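
The uniform calculations above follow directly from the rectangle shape of the distribution. The helper functions below are hypothetical names written for this sketch, not part of any library.

```python
# X ~ U(3, 13): every value between the endpoints is equally likely.
LOW, HIGH = 3, 13

def uniform_cdf(x, low=LOW, high=HIGH):
    """P(X < x) for a uniform distribution on [low, high]."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)

def uniform_percentile(p, low=LOW, high=HIGH):
    """Value k such that P(X < k) = p."""
    return low + p * (high - low)

mean = (LOW + HIGH) / 2               # midpoint of the interval
median = uniform_percentile(0.50)     # same as the mean by symmetry
pct_65 = uniform_percentile(0.65)     # solves 0.65 = (k - 3)(1/10)
upper_tail = 1 - uniform_cdf(10)      # P(x > 10)
lower_tail = uniform_cdf(6)           # P(x < 6)

print(mean, median, pct_65, round(upper_tail, 2), round(lower_tail, 2))
```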


Exercise: 


Problem: Which of the following is true for the above box plot? 


• a 25% of the data are at most 5.

• b There is about the same amount of data from 4 to 5 as there is
from 5 to 7.

• c There are no data values of 3.

• d 50% of the data are 4.


Solution: 


• a False - 3/4 of the data are at most 5
• b True - each quartile has 25% of the data
• c False - that is unknown
• d False - 50% of the data are 4 or less


Exercise: 


Problem: 
If P(G | H) = P(G), then which of the following is correct?


• A G and H are mutually exclusive events.

• B P(G) = P(H)

• C Knowing that H has occurred will affect the chance that G will
happen.

• D G and H are independent events.


Solution: 


D 


Exercise: 


Problem: 


If P(J) = 0.3, P(K) = 0.6, and J and K are independent events, 
then explain which are correct and which are incorrect. 


• A P(J and K) = 0
• B P(J or K) = 0.9
• C P(J or K) = 0.72
• D P(J) ≠ P(J | K)


Solution: 


• A False - J and K are independent, so they are not mutually
exclusive, which would imply dependency (meaning P(J and K) is
not 0).

• B False - see answer C.

• C True - P(J or K) = P(J) + P(K) - P(J and K) = P(J) + P(K) -
P(J)P(K) = 0.3 + 0.6 - (0.3)(0.6) = 0.72. Note that P(J and K) =
P(J)P(K) because J and K are independent.

• D False - J and K are independent so P(J) = P(J | K).
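
The arithmetic behind answers A through D can be laid out step by step. This is a small sketch using only the probabilities given in the problem; the variable names are illustrative.

```python
p_j, p_k = 0.3, 0.6

# Independence means the joint probability is the product.
p_j_and_k = p_j * p_k

# Addition rule for "or".
p_j_or_k = p_j + p_k - p_j_and_k

# Conditional probability; for independent events it equals P(J).
p_j_given_k = p_j_and_k / p_k

print(round(p_j_and_k, 2), round(p_j_or_k, 2), round(p_j_given_k, 2))
```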


Exercise: 


Problem: 


On average, 5 students from each high school class get full 
scholarships to 4-year colleges. Assume that most high school classes 
have about 500 students. 


X = the number of students from a high school class that get full
scholarships to 4-year schools. Which of the following is the
distribution of X?


• A P(5)
• B B(500, 5)
• C Exp(1/5)
• D N(5, (0.01)(0.99)/500)


Solution: 


A 


Lab 1: Normal Distribution (Lap Times) 
Class Time: 


Names: 


Student Learning Outcome: 


• The student will compare and contrast empirical data and a theoretical distribution
to determine if Terri Vogel's lap times fit a continuous distribution.


Directions: 


Round the relative frequencies and probabilities to 4 decimal places. Carry all other 
decimal answers to 2 places. 


Collect the Data 


1. Use the data from Terri Vogel’s Log Book. Use a Stratified Sampling Method by 
Lap (Races 1 — 20) and a random number generator to pick 6 lap times from each 
stratum. Record the lap times below for Laps 2 — 7. 


2. Construct a histogram. Make 5 - 6 intervals. Sketch the graph using a ruler and 
pencil. Scale the axes. 


Frequency 


Lap Time 


3. Calculate the following. 


◦ a x̄ =
◦ b s =


4. Draw a smooth curve through the tops of the bars of the histogram. Use 1 — 2 
complete sentences to describe the general shape of the curve. (Keep it simple. 
Does the graph go straight across, does it have a V-shape, does it have a hump in 
the middle or at either end, etc.?) 


Analyze the Distribution 


Using your sample mean, sample standard deviation, and histogram to help, what was 
the approximate theoretical distribution of the data? 


• X ~ ________
• How does the histogram help you arrive at the approximate distribution?


Describe the Data 


Use the Data from the section titled "Collect the Data" to complete the following 
statements. 


• The IQR goes from ________ to ________.
• IQR = ________ (IQR = Q3 - Q1)
• The 15th percentile is:
• The 85th percentile is:
• The median is:
• The empirical probability that a randomly chosen lap time is more than 130 seconds =
• Explain the meaning of the 85th percentile of this data.


Theoretical Distribution 


Using the theoretical distribution from the section titled "Analyze the Distribution"
complete the following statements:


• The IQR goes from ________ to ________.
• IQR = ________
• The 15th percentile is:
• The 85th percentile is:
• The median is:
• The probability that a randomly chosen lap time is more than 130 seconds =
• Explain the meaning of the 85th percentile of this distribution.


Discussion Questions 


• Do the data from the section titled "Collect the Data" give a close approximation to
the theoretical distribution in the section titled "Analyze the Distribution"? In
complete sentences and comparing the results in the sections titled "Describe the
Data" and "Theoretical Distribution", explain why or why not.


Lab 2: Normal Distribution (Pinkie Length) 
Class Time: 


Names: 


Student Learning Outcomes: 
• The student will compare empirical data and a theoretical distribution
to determine if data from the experiment follow a continuous
distribution.


Collect the Data 
Measure the length of your pinkie finger (in cm.) 


1. Randomly survey 30 adults. Round to the nearest 0.5 cm. 


2. Construct a histogram. Make 5-6 intervals. Sketch the graph using a 
ruler and pencil. Scale the axes. 


Frequency 


Length of Finger 


3. Calculate the Following 


◦ a x̄ =
◦ b s =


4. Draw a smooth curve through the top of the bars of the histogram. Use 
1-2 complete sentences to describe the general shape of the curve. 
(Keep it simple. Does the graph go straight across, does it have a V- 
shape, does it have a hump in the middle or at either end, etc.?) 


Analyze the Distribution 


Using your sample mean, sample standard deviation, and histogram to help,
what was the approximate theoretical distribution of the data from the
section titled "Collect the Data"?


• X ~ ________
• How does the histogram help you arrive at the approximate
distribution?


Describe the Data 


Using the data in the section titled "Collect the Data" complete the 
following statements. (Hint: order the data) 


Note: (IQR = Q3 - Q1)


• IQR =

• 15th percentile is:

• 85th percentile is:

• Median is:

• What is the empirical probability that a randomly chosen pinkie length
is more than 6.5 cm?

• Explain the meaning of the 85th percentile of this data.


Theoretical Distribution 


Using the Theoretical Distribution in the section titled "Analyze the 
Distribution" 


• IQR =

• 15th percentile is:

• 85th percentile is:

• Median is:

• What is the theoretical probability that a randomly chosen pinkie
length is more than 6.5 cm?

• Explain the meaning of the 85th percentile of this data.


Discussion Questions 


• Do the data from the section titled "Collect the Data" give a close
approximation to the theoretical distribution in "Analyze the
Distribution"? In complete sentences and comparing the results in the
sections titled "Describe the Data" and "Theoretical Distribution",
explain why or why not.


Practice Final Exam 1 
This module is a practice final for an associated elementary statistics textbook, Collaborative Statistics. 


Questions 1-2 refer to the following: 


An experiment consists of tossing two 12-sided dice (the numbers 1-12 are printed on the sides of each die).


• Let Event A = both dice show an even number
• Let Event B = both dice show a number more than 8


Exercise: 


Problem: Events A and B are: 


• A Mutually exclusive.

• B Independent.

• C Mutually exclusive and independent.

• D Neither mutually exclusive nor independent.


Solution: 


B: Independent. 


Exercise: 


Problem: Find P (A|B) 


Solution: 


C: 4/16
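
Both dice questions can be verified by listing all 144 equally likely outcomes. The sketch below is one way to do that with exact fractions; the function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes of two 12-sided dice.
outcomes = list(product(range(1, 13), repeat=2))

def prob(event):
    """Exact probability of an event given as a true/false function."""
    hits = sum(1 for roll in outcomes if event(roll))
    return Fraction(hits, len(outcomes))

def both_even(roll):          # Event A
    return all(face % 2 == 0 for face in roll)

def both_above_8(roll):       # Event B
    return all(face > 8 for face in roll)

p_a = prob(both_even)
p_b = prob(both_above_8)
p_a_and_b = prob(lambda roll: both_even(roll) and both_above_8(roll))

# Conditional probability P(A | B) = P(A and B) / P(B).
p_a_given_b = p_a_and_b / p_b

independent = (p_a_and_b == p_a * p_b)
mutually_exclusive = (p_a_and_b == 0)

print(p_a, p_b, p_a_and_b, p_a_given_b, independent, mutually_exclusive)
```

Because P(A | B) comes out equal to P(A), knowing that both dice exceed 8 does not change the chance that both are even.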
Exercise: 

Problem: 


Which of the following are TRUE when we perform a hypothesis test on matched or paired samples? 


• A Sample sizes are almost never small.
• B Two measurements are drawn from the same pair of individuals or objects.
• C Two sample means are compared to each other.
• D Answer choices B and C are both true.


Solution: 


B: Two measurements are drawn from the same pair of individuals or objects. 


Questions 4 - 5 refer to the following: 


118 students were asked what color their bedrooms were painted: light colors, dark colors or vibrant colors.
The results were tabulated according to gender.


          Light colors   Dark colors   Vibrant colors
Female    20             22            28
Male      10             30            8
Exercise: 

Problem: 


Find the probability that a randomly chosen student is male or has a bedroom painted with light colors. 


Solution: 


B: 68/118


Exercise: 
Problem: 


Find the probability that a randomly chosen student is male given the student’s bedroom is painted with dark 


colors. 


Solution: 
D: 30/52
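
The two bedroom-color probabilities come straight from the table counts. A minimal sketch with exact fractions, using the counts from the table; the dictionary layout is an assumption made for this example.

```python
from fractions import Fraction

# Counts from the table: rows are gender, columns are paint color.
counts = {
    "female": {"light": 20, "dark": 22, "vibrant": 28},
    "male":   {"light": 10, "dark": 30, "vibrant": 8},
}

total = sum(sum(row.values()) for row in counts.values())
male_total = sum(counts["male"].values())
light_total = sum(row["light"] for row in counts.values())
dark_total = sum(row["dark"] for row in counts.values())

# P(male or light) = P(male) + P(light) - P(male and light)
p_male_or_light = Fraction(
    male_total + light_total - counts["male"]["light"], total
)

# P(male | dark) restricts attention to the dark-color column.
p_male_given_dark = Fraction(counts["male"]["dark"], dark_total)

print(total, p_male_or_light, p_male_given_dark)
```

Fraction reduces automatically, so 68/118 prints as 34/59 and 30/52 prints as 15/26.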


Questions 6 — 7 refer to the following: 


We are interested in the number of times a teenager must be reminded to do his/her chores each week. A survey of 
40 mothers was conducted. The table below shows the results of the survey. 


£ P (2) 
0 w 
5 


1 
4 ry 
5 ry 
Exercise: 


Problem: Find the probability that a teenager is reminded 2 times. 


e A8 
e BS 
ce 
e D2 


Solution: 


» & 
B: 4 


Exercise: 
Problem: Find the expected number of times a teenager is reminded to do his/her chores. 


• A 1.5
• B 2.78
• C 1.0
• D 3.13


Solution: 


B: 2.78 


Questions 8 — 9 refer to the following: 


On any given day, approximately 37.5% of the cars parked in the De Anza parking structure are parked crookedly. 
(Survey done by Kathy Plum.) We randomly survey 22 cars. We are interested in the number of cars that are 
parked crookedly. 

Exercise: 


Problem: For every 22 cars, how many would you expect to be parked crookedly, on average? 


• A 8.25
• B 11
• C 18
• D 7.5


Solution: 


A: 8.25 


Exercise: 


Problem: What is the probability that at least 10 of the 22 cars are parked crookedly?


• A 0.1263
• B 0.1607
• C 0.2870
• D 0.8393


Solution: 


C: 0.2870 
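
Questions 8 and 9 treat the number of crookedly parked cars as binomial with n = 22 and p = 0.375. The sketch below computes both answers from the binomial formula; binom_pmf is a hypothetical helper written for this example.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n_cars, p_crooked = 22, 0.375

# Expected number of crookedly parked cars: n * p.
expected_crooked = n_cars * p_crooked

# P(X >= 10): add the probabilities for 10, 11, ..., 22 cars.
p_at_least_10 = sum(binom_pmf(k, n_cars, p_crooked)
                    for k in range(10, n_cars + 1))

print(expected_crooked, round(p_at_least_10, 4))
```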
Exercise: 
Problem: 
Using a sample of 15 Stanford-Binet IQ scores, we wish to conduct a hypothesis test. Our claim is that the 


mean IQ score on the Stanford-Binet IQ test is more than 100. It is known that the standard deviation of all 
Stanford-Binet IQ scores is 15 points. The correct distribution to use for the hypothesis test is: 


• A Binomial
• B Student's-t
• C Normal
• D Uniform


Solution: 


C: Normal 


Questions 11 — 13 refer to the following: 


De Anza College keeps statistics on the pass rate of students who enroll in math classes. In a sample of 1795 
students enrolled in Math 1A (1st quarter calculus), 1428 passed the course. In a sample of 856 students enrolled in 
Math 1B (2nd quarter calculus), 662 passed. In general, are the pass rates of Math 1A and Math 1B statistically the 
same? Let A = the subscript for Math 1A and B = the subscript for Math 1B. 

Exercise: 


Problem: If you were to conduct an appropriate hypothesis test, the alternate hypothesis would be: 


• A Ha: pA = pB
• B Ha: pA > pB
• C Ho: pA = pB
• D Ha: pA ≠ pB


Solution:
D: Ha: pA ≠ pB
Exercise:
Problem: The Type I error is to:


• A conclude that the pass rate for Math 1A is the same as the pass rate for Math 1B when, in fact, the pass
rates are different.

• B conclude that the pass rate for Math 1A is different than the pass rate for Math 1B when, in fact, the
pass rates are the same.

• C conclude that the pass rate for Math 1A is greater than the pass rate for Math 1B when, in fact, the pass
rate for Math 1A is less than the pass rate for Math 1B.

• D conclude that the pass rate for Math 1A is the same as the pass rate for Math 1B when, in fact, they are
the same.


Solution: 


B: conclude that the pass rate for Math 1A is different than the pass rate for Math 1B when, in fact, the pass 
rates are the same. 


Exercise: 
Problem: The correct decision is to: 
• A reject Ho
• B not reject Ho
• C There is not enough information given to conduct the hypothesis test


Solution:
B: not reject Ho
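
The decision in the last question can be checked with a two-proportion z-test. This is a sketch under stated assumptions: it uses the pooled standard error and a significance level of 0.05, which the problem does not state; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

passed_a, n_a = 1428, 1795   # Math 1A
passed_b, n_b = 662, 856     # Math 1B

p_a = passed_a / n_a
p_b = passed_b / n_b

# Under Ho the two pass rates are equal, so pool the samples.
p_pooled = (passed_a + passed_b) / (n_a + n_b)
std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n_a + 1 / n_b))

z = (p_a - p_b) / std_error

# Two-tailed p-value because Ha is "not equal".
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05  # assumed significance level, not given in the problem
reject_null = p_value < alpha

print(round(z, 2), round(p_value, 3), reject_null)
```

The p-value is well above 0.05, which is consistent with the decision not to reject the null hypothesis.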


Kia, Alejandra, and Iris are runners on the track teams at three different schools. Their running times, in minutes, 
and the statistics for the track teams at their respective schools, for a one mile run, are given in the table below: 


            Running Time   School Average Running Time   School Standard Deviation
Kia         4.9            5.2                            .15
Alejandra   4.2            4.6                            .25
Iris        4.5            4.9                            .12


Exercise: 


Problem: Which student is the BEST when compared to the other runners at her school? 
• A Kia
• B Alejandra
• C Iris
• D Impossible to determine


Solution: 


C: Iris 
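
Comparing runners across schools calls for z-scores, and for running times the most negative z-score is best. The sketch below assumes the standard deviations in the table are 0.15, 0.25, and 0.12 minutes (the decimal points are an assumption about the table) and uses illustrative variable names.

```python
# (running time, school average, school standard deviation) in minutes.
# The standard deviations are read as 0.15, 0.25, and 0.12 (assumed).
runners = {
    "Kia":       (4.9, 5.2, 0.15),
    "Alejandra": (4.2, 4.6, 0.25),
    "Iris":      (4.5, 4.9, 0.12),
}

# z = (value - mean) / standard deviation
z_scores = {
    name: (time - average) / sd
    for name, (time, average, sd) in runners.items()
}

# A faster runner has a lower time, so the smallest z-score wins.
best = min(z_scores, key=z_scores.get)

for name, z in z_scores.items():
    print(f"{name}: z = {z:.2f}")
print("Best relative to her school:", best)
```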


Questions 15 — 16 refer to the following: 


The following adult ski sweater prices are from the Gorsuch Ltd. Winter catalog: 


{$212, $292, $278, $199, $280, $236}


Assume the underlying sweater price population is approximately normal. The null hypothesis is that the mean 
price of adult ski sweaters from Gorsuch Ltd. is at least $275. 
Exercise: 


Problem: The correct distribution to use for the hypothesis test is: 


• A Normal
• B Binomial
• C Student's-t
• D Exponential


Solution: 


C: Student's-t 


Exercise: 


Problem: The hypothesis test: 


• A is two-tailed
• B is left-tailed
• C is right-tailed
• D has no tails


Solution: 


B: is left-tailed 
Exercise: 
Problem: 
Sara, a Statistics student, wanted to determine the mean number of books that college professors have in their 


office. She randomly selected 2 buildings on campus and asked each professor in the selected buildings how 
many books are in his/her office. Sara surveyed 25 professors. The type of sampling selected is a: 


• A simple random sampling
• B systematic sampling
• C cluster sampling
• D stratified sampling


Solution: 


C: cluster sampling 
Exercise: 
Problem: 


A clothing store would use which measure of the center of data when placing orders for the typical "middle" 
customer? 


• A Mean
• B Median
• C Mode
• D IQR


Solution: 


B: Median 


Exercise: 


Problem: In a hypothesis test, the p-value is 


• A the probability that an outcome of the data will happen purely by chance when the null hypothesis is
true.
• B called the preconceived alpha.
• C compared to beta to decide whether to reject or not reject the null hypothesis.
• D Answer choices A and B are both true.


Solution: 


A: the probability that an outcome of the data will happen purely by chance when the null hypothesis is true. 


Questions 20 - 22 refer to the following: 


A community college offers classes 6 days a week: Monday through Saturday. Maria conducted a study of the 

students in her classes to determine how many days per week the students who are in her classes come to campus 
for classes. In each of her 5 classes she randomly selected 10 students and asked them how many days they come 
to campus for classes. Each of her classes are the same size. The results of her survey are summarized in the table 


below. 


Number of Days on 
Campus 


1 


2 


Exercise: 


Frequency 
2 
12 


10 


Relative 
Frequency 


24 


.20 


02 


Cumulative Relative 
Frequency 


98 


Problem: Combined with convenience sampling, what other sampling technique did Maria use? 


• A simple random
• B systematic
• C cluster
• D stratified


Solution: 


D: stratified 


Exercise: 


Problem: How many students come to campus for classes 4 days a week? 


• A 49
• B 25
• C 30
• D 13


Solution: 
B: 25 
Exercise: 
Problem: What is the 60th percentile for this data?


• A 2
• B 3
• C 4
• D 5


Solution:
C: 4
The next two questions refer to the following: 


The following data are the results of a random survey of 110 Reservists called to active duty to increase security at 
California airports. 


Number of Dependents   Frequency
0                      11
1                      27
2                      33
3                      20
4                      19


Exercise: 
Problem: 


Construct a 95% Confidence Interval for the true population mean number of dependents of Reservists called 
to active duty to increase security at California airports. 


• A (1.85, 2.32)
• B (1.80, 2.36)
• C (1.97, 2.46)
• D (1.92, 2.50)


Solution: 


A: (1.85, 2.32) 


Exercise: 


Problem: The 95% confidence interval above means:


• A 5% of confidence intervals constructed this way will not contain the true population average number of
dependents.
• B We are 95% confident the true population mean number of dependents falls in the interval.
• C Both of the above answer choices are correct.
• D None of the above.


Solution: 


C: Both above are correct. 


Exercise: 


Problem: X~U (4, 10). Find the 30th percentile. 


• A 0.3000
• B 3
• C 5.8
• D 6.1


Solution: 


C: 5.8 


Exercise: 


Problem: If X ~ Exp(0.8), then P(x < μ) =


• A 0.3679
• B 0.4727
• C 0.6321
• D cannot be determined


Solution: 


C: 0.6321 
Exercise: 
Problem: 


The lifetime of a computer circuit board is normally distributed with a mean of 2500 hours and a standard 
deviation of 60 hours. What is the probability that a randomly chosen board will last at most 2560 hours? 


• A 0.8413
• B 0.1587
• C 0.3461
• D 0.6539


Solution: 


A: 0.8413 
Exercise: 


Problem: 


A survey of 123 Reservists called to active duty as a result of the September 11, 2001, attacks was conducted 
to determine the proportion that were married. Eighty-six reported being married. Construct a 98% 
confidence interval for the true population proportion of reservists called to active duty that are married. 


• A (0.6030, 0.7954)
• B (0.6181, 0.7802)
• C (0.5927, 0.8057)
• D (0.6312, 0.7672)


Solution: 


A: (0.6030, 0.7954) 
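
The 98% interval for the proportion married can be rebuilt from the sample counts. This is a minimal sketch, assuming the usual normal-approximation interval for a proportion and Python's standard-library NormalDist for the critical value.

```python
from math import sqrt
from statistics import NormalDist

married, n = 86, 123
p_hat = married / n            # point estimate of the proportion

confidence = 0.98
# Critical value leaves 1% in each tail of the standard normal curve.
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Error bound for a proportion: z * sqrt(p'q'/n)
margin = z_crit * sqrt(p_hat * (1 - p_hat) / n)

lower, upper = p_hat - margin, p_hat + margin
print(f"({lower:.4f}, {upper:.4f})")
```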
Exercise: 


Problem: 


Winning times in 26 mile marathons run by world class runners average 145 minutes with a standard 
deviation of 14 minutes. A sample of the last 10 marathon winning times is collected. 


Let x̄ = mean winning times for 10 marathons.


The distribution for x̄ is:


• A N(145, 14/√10)
• B N(145, 14)
• C t_9
• D t_10


Solution:


A: N(145, 14/√10)
Exercise: 
Problem: 


Suppose that Phi Beta Kappa honors the top 1% of college and university seniors. Assume that grade point 
means (G.P.A.) at a certain college are normally distributed with a 2.5 mean and a standard deviation of 0.5. 
What would be the minimum G.P.A. needed to become a member of Phi Beta Kappa at that college? 


• A 3.99
• B 1.34
• C 3.00
• D 3.66


Solution: 


D: 3.66 


The number of people living on American farms has declined steadily during this century. Here are data on the 
farm population (in millions of persons) from 1935 to 1980. 


Year        1935  1940  1945  1950  1955  1960  1965  1970  1975  1980
Population  32.1  30.5  24.4  23.0  19.1  15.6  12.4   9.7   8.9   7.2


The linear regression equation is y-hat = 1166.93 - 0.5868x
Exercise: 


Problem: What was the expected farm population (in millions of persons) for 1980? 


• A 7.2
• B 5.1
• C 6.0
• D 8.0


Solution:


B: 5.1
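
The expected farm population is found by substituting the year into the regression equation. A short sketch using the coefficients given above; predict_population is a hypothetical helper name.

```python
INTERCEPT = 1166.93
SLOPE = -0.5868   # millions of persons per year

def predict_population(year):
    """Predicted farm population (millions) from the regression line."""
    return INTERCEPT + SLOPE * year

predicted_1980 = predict_population(1980)
observed_1980 = 7.2

# Residual = observed value minus predicted value.
residual_1980 = observed_1980 - predicted_1980

print(round(predicted_1980, 1), round(residual_1980, 1))
```

The observed value for 1980 sits above the line, so the residual is positive.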


Exercise: 


Problem: In linear regression, which is the best possible SSE? 


• A 13.46
• B 18.22
• C 24.05
• D 16.33


Solution: 


A: 13.46 
Exercise: 
Problem: 


In regression analysis, if the correlation coefficient is close to 1 what can be said about the best fit line? 


• A It is a horizontal line. Therefore, we can not use it.
• B There is a strong linear pattern. Therefore, it is most likely a good model to be used.
• C The coefficient correlation is close to the limit. Therefore, it is hard to make a decision.
• D We do not have the equation. Therefore, we can not say anything about it.


Solution: 


B: There is a strong linear pattern. Therefore, it is most likely a good model to be used. 


Questions 34-36 refer to the following:


A study of the career plans of young women and men sent questionnaires to all 722 members of the senior class in 
the College of Business Administration at the University of Illinois. One question asked which major within the 
business program the student had chosen. Here are the data from the students who responded. 


                 Female   Male
Accounting       68       56
Administration   91       40
Economics        5        6
Finance          61       59


Does the data suggest that there is a relationship between the gender of students and their choice of major? 


Exercise: 


Problem: The distribution for the test is: 


° AChi’, 

• B Chi-square with df = 3

• C t with df = 721

• D N(0, 1)


Solution:


B: Chi-square with df = 3


Exercise: 


Problem: The expected number of females who choose Finance is:


• A 37
• B 61
• C 60
• D 70


Solution: 


D: 70 
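
In a test of independence, each expected count is (row total)(column total)/(grand total). The sketch below applies that rule to the majors table; the dictionary layout and names are assumptions made for this example.

```python
# Observed counts from the table: (female, male) for each major.
observed = {
    "Accounting":     (68, 56),
    "Administration": (91, 40),
    "Economics":      (5, 6),
    "Finance":        (61, 59),
}

female_total = sum(f for f, _ in observed.values())
male_total = sum(m for _, m in observed.values())
grand_total = female_total + male_total

def expected_count(row_total, column_total):
    """Expected cell count when rows and columns are independent."""
    return row_total * column_total / grand_total

finance_total = sum(observed["Finance"])
expected_female_finance = expected_count(finance_total, female_total)

# Degrees of freedom: (rows - 1)(columns - 1)
degrees_of_freedom = (len(observed) - 1) * (2 - 1)

print(grand_total, round(expected_female_finance, 2), degrees_of_freedom)
```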


Exercise: 


Problem: The p-value is 0.0127 and the level of significance is 0.05. The conclusion to the test is: 


• A There is insufficient evidence to conclude that the choice of major and the gender of the student are not
independent of each other.

• B There is sufficient evidence to conclude that the choice of major and the gender of the student are not
independent of each other.

• C There is sufficient evidence to conclude that students find Economics very hard.

• D There is insufficient evidence to conclude that more females prefer Administration than males.


Solution: 
B: There is sufficient evidence to conclude that the choice of major and the gender of the student are not 
independent of each other. 

Exercise: 
Problem: 
An agency reported that the work force nationwide is composed of 10% professional, 10% clerical, 30% 
skilled, 15% service, and 35% semiskilled laborers. A random sample of 100 San Jose residents indicated 15 


professional, 15 clerical, 40 skilled, 10 service, and 20 semiskilled laborers. At α = 0.10, does the work force in
San Jose appear to be consistent with the agency report for the nation? Which kind of test is it?


• A Chi² goodness of fit
• B Chi² test of independence
• C Independent groups proportions
• D Unable to determine


Solution:


A: Chi² goodness of fit
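
Although the question only asks for the kind of test, the goodness-of-fit statistic is easy to compute. This is a sketch under one stated assumption: the critical value 7.779 for df = 4 and a 0.10 right tail is copied from a standard chi-square table rather than computed.

```python
categories = ["professional", "clerical", "skilled", "service", "semiskilled"]
observed = [15, 15, 40, 10, 20]                   # San Jose sample of 100
national_share = [0.10, 0.10, 0.30, 0.15, 0.35]   # agency report

sample_size = sum(observed)
expected = [share * sample_size for share in national_share]

# Test statistic: sum of (O - E)^2 / E over all categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

degrees_of_freedom = len(categories) - 1

# Assumed table value: chi-square critical value, df = 4, alpha = 0.10.
CRITICAL_VALUE = 7.779
reject_null = chi_square > CRITICAL_VALUE

print(round(chi_square, 2), degrees_of_freedom, reject_null)
```

Under these assumptions the statistic exceeds the critical value, so the San Jose sample does not appear consistent with the national report.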


Practice Final Exam 2 

This module is a practice final for an associated elementary statistics textbook, 
Collaborative Statistics, available for Fall 2008. 

Exercise: 


Problem: 


A study was done to determine the proportion of teenagers that own a car. 
The population proportion of teenagers that own a car is the 


• A statistic
• B parameter
• C population
• D variable


Solution: 


B: parameter 


The next two questions refer to the following data: 


value frequency 
0 1 
1 4
2 7 
° 9 
6 4 


Exercise: 


Problem: The box plot for the data is: 


e A 


Solution: 


A 
Exercise: 


Problem: 


If 6 were added to each value of the data in the table, the 15th percentile of 
the new list of values is: 


• A 6
• B 1
• C 7
• D 8


Solution:


C: 7


The next two questions refer to the following situation: 


Suppose that the probability of a drought in any independent year is 20%. Out of 
those years in which a drought occurs, the probability of water rationing is 10%. 
However, in any year, the probability of water rationing is 5%. 

Exercise: 


Problem: 
What is the probability of both a drought and water rationing occurring? 


• A 0.05
• B 0.01
• C 0.02
• D 0.30

Solution:

C: 0.02

Exercise: 

Problem: Which of the following is true? 
• A drought and water rationing are independent events
• B drought and water rationing are mutually exclusive events
• C none of the above


Solution: 


C: none of the above 
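
Both drought answers follow from the multiplication rule. A small sketch using only the probabilities stated in the situation; the variable names are illustrative.

```python
p_drought = 0.20
p_rationing_given_drought = 0.10
p_rationing = 0.05

# Multiplication rule: P(D and R) = P(D) * P(R | D)
p_both = p_drought * p_rationing_given_drought

# Independent events would have P(R | D) equal to P(R).
independent = abs(p_rationing_given_drought - p_rationing) < 1e-12

# Mutually exclusive events would have P(D and R) equal to 0.
mutually_exclusive = p_both == 0

print(round(p_both, 2), independent, mutually_exclusive)
```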


The next two questions refer to the following situation: 


Suppose that a survey yielded the following data: 


gender apple pumpkin pecan 
female 40 10 30 
male 20 30 10 


Favorite Pie Type 


Exercise: 


Problem: 


Suppose that one individual is randomly chosen. The probability that the 
person’s favorite pie is apple or the person is male is: 


Solution: 
D: 100/140


Exercise: 


Problem: Suppose H, is: Favorite pie type and gender are independent. 
The p-value is: 

• A ≈ 0
• B 1
• C 0.05
• D cannot be determined


Solution:


A: ≈ 0


The next two questions refer to the following situation: 


Let’s say that the probability that an adult watches the news at least once per 
week is 0.60. We randomly survey 14 people. Of interest is the number that 
watch the news at least once per week. 

Exercise: 


Problem: Which of the following statements is FALSE? 


• A X ~ B(14, 0.60)
• B The values for x are: {1, 2, 3, ..., 14}
• C μ = 8.4
• D P(X = 5) = 0.0408


Solution:


B: The values for x are: {1, 2, 3, ..., 14}


Exercise: 


Problem: Find the probability that at least 6 adults watch the news. 


• A 6/14
• B 0.8499
• C 0.9417
• D 0.6429


Solution: 


C: 0.9417 
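
The news-watching answers can be confirmed from the binomial formula for X ~ B(14, 0.60). In this sketch binom_pmf is a hypothetical helper, not a library function.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n_people, p_watch = 14, 0.60

# X counts successes, so it can be any whole number from 0 to 14.
possible_values = list(range(0, n_people + 1))

mean_watchers = n_people * p_watch
p_exactly_5 = binom_pmf(5, n_people, p_watch)
p_at_least_6 = sum(binom_pmf(k, n_people, p_watch)
                   for k in range(6, n_people + 1))

print(round(mean_watchers, 1), round(p_exactly_5, 4), round(p_at_least_6, 4))
```

The list of possible values starts at 0, which is why the statement listing {1, 2, 3, ..., 14} is the false one.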
Exercise: 


Problem: 


The following histogram is most likely to be a result of sampling from 
which distribution? 


• A Chi-Square with df = 6
• B Exponential
• C Uniform
• D Binomial


Solution: 


D: Binomial 


The ages of campus day and evening students are known to be normally
distributed. A sample of 6 campus day and evening students reported their ages
(in years) as: {18, 35, 27, 45, 20, 20}

Exercise: 


Problem: 


What is the error bound for the 90% confidence interval of the true average 
age? 


• A 11.2
• B 22.3
• C 17.5
• D 8.7


Solution: 


D: 8.7 
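
With only 6 ages and an unknown population standard deviation, the error bound uses the Student's-t distribution. This is a sketch under one stated assumption: the critical value 2.015 (df = 5, 0.05 in the upper tail) is copied from a standard t-table, since Python's standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

ages = [18, 35, 27, 45, 20, 20]

sample_mean = mean(ages)
sample_sd = stdev(ages)          # sample standard deviation (n - 1)
n = len(ages)

# Assumed table value: t critical value for 90% confidence, df = n - 1 = 5.
T_CRITICAL = 2.015

# Error bound for a mean: t * s / sqrt(n)
error_bound = T_CRITICAL * sample_sd / sqrt(n)

print(sample_mean, round(sample_sd, 2), round(error_bound, 1))
```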
Exercise: 
Problem: 


If a normally distributed random variable has μ = 0 and σ = 1, then
97.5% of the population values lie above:


• A -1.96
• B 1.96
• C 1
• D -1


Solution: 


A: -1.96 


The next three questions refer to the following situation: 


The amount of money a customer spends in one trip to the supermarket is 
known to have an exponential distribution. Suppose the average amount of 
money a customer spends in one trip to the supermarket is $72. 

Exercise: 


Problem: 


What is the probability that one customer spends less than $72 in one trip 
to the supermarket? 


• A 0.6321
• B 0.5000
• C 0.3714
• D 1


Solution: 


A: 0.6321 
Exercise: 


Problem: 


How much money altogether would you expect the next 5 customers to spend
in one trip to the supermarket (in dollars)?


• C 5184
• D 360


Solution: 


D: 360 


Exercise: 


Problem: 


If you want to find the probability that the mean of 50 customers is less 
than $60, the distribution to use is: 


• A N(72, 72)
• B N(72, 72/√50)
• C Exp(72)
• D Exp(1/72)


Solution:
B: N(72, 72/√50)
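
The three supermarket answers rest on two facts: an exponential distribution has standard deviation equal to its mean, and the mean of 50 customers is approximately normal by the Central Limit Theorem. A minimal sketch, assuming those facts and using illustrative variable names:

```python
from math import exp, sqrt
from statistics import NormalDist

mean_spend = 72.0
decay_rate = 1 / mean_spend      # exponential decay parameter m

# P(X < 72) = 1 - e^(-m * 72) = 1 - e^(-1)
p_less_than_mean = 1 - exp(-decay_rate * 72)

# Expected total for 5 customers is 5 times the mean.
expected_total_5 = 5 * mean_spend

# Mean of 50 customers: approximately N(72, 72 / sqrt(50)).
n_customers = 50
mean_of_50 = NormalDist(mu=mean_spend,
                        sigma=mean_spend / sqrt(n_customers))
p_mean_below_60 = mean_of_50.cdf(60)

print(round(p_less_than_mean, 4), expected_total_5, round(p_mean_below_60, 4))
```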


The next three questions refer to the following situation: 


The amount of time it takes a fourth grader to carry out the trash is uniformly 
distributed in the interval from 1 to 10 minutes. 
Exercise: 


Problem: 


What is the probability that a randomly chosen fourth grader takes more 
than 7 minutes to take out the trash? 


Solution: 
A: 3/9


Exercise: 


Problem: 
Which graph best shows the probability that a randomly chosen fourth 


grader takes more than 6 minutes to take out the trash given that he/she has 
already taken more than 3 minutes? 


[Figure: answer-choice graphs of f(x)]


Solution: 


D 
Exercise: 
Problem: 


We should expect a fourth grader to take how many minutes to take out the 
trash? 


• A 4.5
• B 5.5
• C 5
• D 10


Solution: 


B: 5.5 
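
For the uniform trash-time questions, probabilities are lengths of intervals divided by the total length of 9 minutes. The sketch below uses exact fractions; prob_more_than is a hypothetical helper written for this example.

```python
from fractions import Fraction

LOW, HIGH = 1, 10   # minutes, X ~ U(1, 10)

def prob_more_than(x):
    """P(X > x) for x between LOW and HIGH."""
    return Fraction(HIGH - x, HIGH - LOW)

p_more_than_7 = prob_more_than(7)

# Conditional probability: P(X > 6 | X > 3) = P(X > 6) / P(X > 3)
p_more_than_6_given_3 = prob_more_than(6) / prob_more_than(3)

# Expected value of a uniform distribution is the midpoint.
expected_minutes = Fraction(LOW + HIGH, 2)

print(p_more_than_7, p_more_than_6_given_3, float(expected_minutes))
```

The conditional probability is larger than the unconditional P(X > 6) because knowing the time already exceeds 3 minutes shrinks the interval of possible values to 7 minutes.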


The next three questions refer to the following situation: 


At the beginning of the quarter, the amount of time a student waits in line at the 
campus Cafeteria is normally distributed with a mean of 5 minutes and a 
standard deviation of 1.5 minutes. 

Exercise: 


Problem: What is the 90th percentile of waiting times (in minutes)? 


• A 1.28
• B 90
• C 7.47
• D 6.92


Solution: 


D: 6.92 


Exercise: 


Problem: The median waiting time (in minutes) for one student is: 


• A 5
• B 50
• C 25
• D 1.5


Solution:


A: 5
Exercise: 


Problem: 


Find the probability that the average wait time of 10 students is at most 5.5 
minutes. 


• A 0.6301
• B 0.8541
• C 0.3694
• D 0.1459


Solution: 


B: 0.8541 
Exercise: 


Problem: 


A sample of 80 software engineers in Silicon Valley is taken and it is found 
that 20% of them earn approximately $50,000 per year. A point estimate 
for the true proportion of engineers in Silicon Valley who earn $50,000 per 
year is: 


• A 16
• B 0.2
• C 1
• D 0.95


Solution: 


B: 0.2 


Exercise: 


Problem: If P(Z < z_α) = 0.1587 where Z ~ N(0, 1), then z_α is equal to:


• A -1
• B 0.1587
• C 0.8413
• D 1


Solution: 


A: -1 


Exercise: 
Problem: 
A professor tested 35 students to determine their entering skills. At the end 


of the term, after completing the course, the same test was administered to 
the same 35 students to study their improvement. This would be a test of: 


• A independent groups
• B 2 proportions
• C matched pairs, dependent groups
• D exclusive groups


Solution: 


C: matched pairs, dependent groups 
Exercise: 
Problem: 


A math exam was given to all the third grade children attending ABC 
School. Two random samples of scores were taken. 


n x S 
Boys 55 82 5 
Girls 60 86 y 


Which of the following correctly describes the results of a hypothesis test 
of the claim, “There is a difference between the mean scores obtained by 
third grade girls and boys at the 5% level of significance”?


• A Do not reject Ho. There is insufficient evidence to conclude that
there is a difference in the mean scores.

• B Do not reject Ho. There is sufficient evidence to conclude that there
is a difference in the mean scores.

• C Reject Ho. There is insufficient evidence to conclude that there is no
difference in the mean scores.

• D Reject Ho. There is sufficient evidence to conclude that there is a
difference in the mean scores.


Solution:


D: Reject Ho. There is sufficient evidence to conclude that there is a
difference in the mean scores.


Exercise: 


Problem: 


In a survey of 80 males, 45 had played an organized sport growing up. Of 
the 70 females surveyed, 25 had played an organized sport growing up. We 
are interested in whether the proportion for males is higher than the 
proportion for females. The correct conclusion is: 


A. There is insufficient information to conclude that the proportion for
males is the same as the proportion for females.

B. There is insufficient information to conclude that the proportion for
males is not the same as the proportion for females.

C. There is sufficient evidence to conclude that the proportion for
males is higher than the proportion for females.

D. Not enough information to determine.


Solution: 


C: There is sufficient evidence to conclude that the proportion for males is 
higher than the proportion for females. 
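
The conclusion can be reproduced with a pooled two-proportion z-test. The sketch below assumes a 5% significance level, which the problem does not state, and uses a right-tailed test because the question asks whether the male proportion is higher.

```python
from math import sqrt
from statistics import NormalDist

# Survey counts from the problem
played_m, n_m = 45, 80
played_f, n_f = 25, 70

p_m = played_m / n_m
p_f = played_f / n_f

# Pooled proportion under H0: the two population proportions are equal
p_pooled = (played_m + played_f) / (n_m + n_f)
std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n_m + 1 / n_f))

z_stat = (p_m - p_f) / std_error

# Right-tailed p-value for Ha: male proportion > female proportion
p_value = 1 - NormalDist().cdf(z_stat)

print(f"z = {z_stat:.3f}, p-value = {p_value:.4f}")
```

The p-value is below 0.05, which is consistent with choice C.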


Exercise: 


Problem: 


Note: Chi-Square Test of a Single Variance; Not all classes cover this topic. 
From past experience, a Statistics teacher has found that the average score 
on a midterm is 81 with a standard deviation of 5.2. This term, a class of 49 
students had a standard deviation of 5 on the midterm. Do the data indicate 
that we should reject the teacher’s claim that the standard deviation is 5.2? 
Use α = 0.05.


A. Yes
B. No
C. Not enough information given to solve the problem


Solution: 


B: No 
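
The test statistic for a single variance is (n − 1)s²/σ². The sketch below computes it and then estimates a two-sided p-value with the Wilson-Hilferty normal approximation to the chi-square distribution. That approximation is a substitute chosen here because the Python standard library has no chi-square distribution; a chi-square table or calculator gives the exact value.

```python
from statistics import NormalDist

n = 49            # class size
s = 5.0           # sample standard deviation
sigma_0 = 5.2     # claimed population standard deviation

df = n - 1
chi_sq = df * s**2 / sigma_0**2

# Wilson-Hilferty: (chi_sq / df)^(1/3) is roughly normal with
# mean 1 - 2/(9 df) and variance 2/(9 df)
c = 2 / (9 * df)
z_approx = ((chi_sq / df) ** (1 / 3) - (1 - c)) / c ** 0.5

lower_tail = NormalDist().cdf(z_approx)
p_two_sided = 2 * min(lower_tail, 1 - lower_tail)

print(f"chi-square = {chi_sq:.2f} with {df} df, approximate p-value = {p_two_sided:.2f}")
```

The statistic sits close to its degrees of freedom and the approximate p-value is far above 0.05, so the claim is not rejected.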
Exercise: 


Problem: 


Note: F Distribution Test of ANOVA; Not all classes cover this topic. 
Three loading machines are being compared. Ten samples were taken for 
each machine. Machine I took an average of 31 minutes to load packages 
with a standard deviation of 2 minutes. Machine II took an average of 28 
minutes to load packages with a standard deviation of 1.5 minutes. 
Machine III took an average of 29 minutes to load packages with a 
standard deviation of 1 minute. Find the p-value when testing that the 
average loading times are the same. 


A. The p-value is close to 0
B. The p-value is close to 1
C. Not enough information given to solve the problem


Solution: 


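A: The p-value is close to 0. With 10 observations per machine, the sample
means 31, 28, and 29 give a mean square between groups of about 23.3, and
the standard deviations 2, 1.5, and 1 give a mean square within groups of
about 2.42. The resulting F statistic is about 9.7 with 2 and 27 degrees of
freedom, which lies far in the right tail.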


The next three questions refer to the following situation: 


A corporation has offices in different parts of the country. It has gathered the 
following information concerning the number of bathrooms and the number of 
employees at seven sites: 


Number of employees, x    650   730   810   900   102   107   1150
Number of bathrooms, y    40    50    54    61    82    110   121


Exercise: 


Problem: 


Is the correlation between the number of employees and the number of 
bathrooms significant? 


A. Yes
B. No
C. Not enough information to answer question


Solution: 
B: No 
Exercise: 
Problem: The linear regression equation is: 


A. ŷ = 0.0094 − 79.96x
B. ŷ = 79.96 + 0.0094x
C. ŷ = 79.96 − 0.0094x
D. ŷ = −0.0094 + 79.96x


Solution: 


C: ŷ = 79.96 − 0.0094x
Exercise: 
Problem: 


If a site has 1150 employees, approximately how many bathrooms should it 
have? 


A. 69
B. 91
C. 91,954
D. We should not be estimating here.


Solution: 


D: We should not be estimating here. 
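
The three answers above can be checked together with a short least-squares computation. The data below take the third bathroom count to be 54, the value that reproduces the stated regression line; treat that as an assumption. The critical value 0.754 for n − 2 = 5 degrees of freedom is the entry from a table of 95% critical values of the sample correlation coefficient.

```python
from math import sqrt

employees = [650, 730, 810, 900, 102, 107, 1150]
bathrooms = [40, 50, 54, 61, 82, 110, 121]   # 54 is an assumed reading

n = len(employees)
mean_x = sum(employees) / n
mean_y = sum(bathrooms) / n

# Sums of squares and cross-products about the means
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(employees, bathrooms))
s_xx = sum((x - mean_x) ** 2 for x in employees)
s_yy = sum((y - mean_y) ** 2 for y in bathrooms)

slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
r = s_xy / sqrt(s_xx * s_yy)

critical_r = 0.754   # 95% critical value for df = n - 2 = 5 (from the table)
significant = abs(r) > critical_r

print(f"y-hat = {intercept:.2f} + ({slope:.4f})x, r = {r:.3f}")
print("Correlation significant:", significant)
```

Because |r| is far below the critical value, the correlation is not significant, and the fitted line should not be used to estimate the number of bathrooms.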
Exercise: 
Problem: 


Note: Chi-Square Test of a Single Variance; Not all classes cover this topic. 
Suppose that a sample of size 10 was collected, with x̄ = 4.4 and s = 1.4.


H0: σ² = 1.6 vs. Ha: σ² ≠ 1.6. Which graph best describes the results of
the test?


[Figure: four candidate graphs of the test, labeled a through d; not reproduced here.]


Solution: 


A 
Exercise: 
Problem: 


64 backpackers were asked the number of days their latest backpacking trip 
was. The number of days is given in the table below: 


# of days    1    2    3    4    5    6    7    8
Frequency    5    9    6    12   7    10   5    10


Conduct an appropriate test to determine if the distribution is uniform. 


A. The p-value is > 0.10. There is insufficient information to conclude
that the distribution is not uniform.

B. The p-value is < 0.01. There is sufficient information to conclude
the distribution is not uniform.

C. The p-value is between 0.01 and 0.10, but without alpha (α) there is
not enough information

D. There is no such test that can be conducted.


Solution: 
A: The p-value is > 0.10. There is insufficient information to conclude 
that the distribution is not uniform. 
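
The goodness-of-fit statistic behind this answer is sketched below. The frequencies are read as 5, 9, 6, 12, 7, 10, 5, 10, which sum to the stated 64 backpackers. The critical value 12.017 is the upper 10% point of the chi-square distribution with 7 degrees of freedom, taken from a chi-square table.

```python
observed = [5, 9, 6, 12, 7, 10, 5, 10]    # trips lasting 1 through 8 days

total = sum(observed)
expected = total / len(observed)          # uniform distribution: equal counts

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E
chi_sq = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1

critical_10_percent = 12.017              # table value for df = 7, right-tail area 0.10
p_value_above_10_percent = chi_sq < critical_10_percent

print(f"total = {total}, expected per cell = {expected}")
print(f"chi-square = {chi_sq:.2f} with {df} df")
print("p-value > 0.10:", p_value_above_10_percent)
```

A statistic smaller than the 10% critical value means the p-value exceeds 0.10, matching choice A.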
Exercise: 
Problem: 
Note: F Distribution test of One-Way ANOVA; Not all classes cover this
topic. Which of the following statements is true when using one-way
ANOVA?


A. The populations from which the samples are selected have different
distributions.

B. The sample sizes are large.

C. The test is to determine if the different groups have the same means.

D. There is a correlation between the factors of the experiment.


Solution: 


C: The test is to determine if the different groups have the same means. 


English Phrases Written Mathematically 
This module provides an overview of commonly used phrases in statistics 
and their mathematical equivalents. 


English Phrases Written Mathematically 


When the English says:               Interpret this as:
X is at least 4.                     X ≥ 4
The minimum of X is 4.               X ≥ 4
X is no less than 4.                 X ≥ 4
X is greater than or equal to 4.     X ≥ 4
X is at most 4.                      X ≤ 4
The maximum of X is 4.               X ≤ 4
X is no more than 4.                 X ≤ 4
X is less than or equal to 4.        X ≤ 4
X does not exceed 4.                 X ≤ 4
X is greater than 4.                 X > 4
X is more than 4.                    X > 4
X exceeds 4.                         X > 4
X is less than 4.                    X < 4
There are fewer X than 4.            X < 4
X is 4.                              X = 4
X is equal to 4.                     X = 4
X is the same as 4.                  X = 4
X is not 4.                          X ≠ 4
X is not equal to 4.                 X ≠ 4
X is not the same as 4.              X ≠ 4
X is different than 4.               X ≠ 4
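
These translations matter most when a phrase has to be turned into a probability. The sketch below uses a hypothetical binomial variable X ~ B(10, 0.5) (the values of n and p are illustrative, not from the text) to show how "at least", "at most", "more than", and "fewer than" select different sets of outcomes.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5   # hypothetical parameters for illustration

def prob(condition):
    """Add up P(X = k) over every outcome k that satisfies the condition."""
    return sum(binomial_pmf(k, n, p) for k in range(n + 1) if condition(k))

at_least_4 = prob(lambda k: k >= 4)     # "X is at least 4"
at_most_4 = prob(lambda k: k <= 4)      # "X is at most 4"
more_than_4 = prob(lambda k: k > 4)     # "X is more than 4"
fewer_than_4 = prob(lambda k: k < 4)    # "X is less than 4"

print(at_least_4, at_most_4, more_than_4, fewer_than_4)
```

Note that "at least 4" and "fewer than 4" are complements, so their probabilities add to 1, and likewise for "at most 4" and "more than 4".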


Symbols and their Meanings 
This module defines symbols used throughout the Collaborative Statistics 
textbook. 


Chapter (1st used) | Symbol | Spoken | Meaning
Sampling and Data | √ | The square root of | same
Sampling and Data | π | Pi | 3.14159... (a specific number)
Descriptive Statistics | Q1 | Quartile one | the first quartile
Descriptive Statistics | Q2 | Quartile two | the second quartile
Descriptive Statistics | Q3 | Quartile three | the third quartile
Descriptive Statistics | IQR | inter-quartile range | Q3 − Q1 = IQR
Descriptive Statistics | x̄ | x-bar | sample mean
Descriptive Statistics | μ | mu | population mean
Descriptive Statistics | s, s_x, sx | s | sample standard deviation
Descriptive Statistics | s², s_x² | s-squared | sample variance
Descriptive Statistics | σ, σ_x, σx | sigma | population standard deviation
Descriptive Statistics | σ², σ_x² | sigma-squared | population variance
Descriptive Statistics | Σ | capital sigma | sum
Probability Topics | {} | brackets | set notation
Probability Topics | S | S | sample space
Probability Topics | A | Event A | event A
Probability Topics | P(A) | probability of A | probability of A occurring
Probability Topics | P(A|B) | probability of A given B | prob. of A occurring given B has occurred
Probability Topics | P(A or B) | prob. of A or B | prob. of A or B or both occurring
Probability Topics | P(A and B) | prob. of A and B | prob. of both A and B occurring (same time)
Probability Topics | A′ | A-prime, complement of A | complement of A, not A
Probability Topics | P(A′) | prob. of complement of A | same
Probability Topics | G1 | green on first pick | same
Probability Topics | P(G1) | prob. of green on first pick | same
Discrete Random Variables | PDF | prob. distribution function | same
Discrete Random Variables | X | X | the random variable X
Discrete Random Variables | X ~ | the distribution of X | same
Discrete Random Variables | B | binomial distribution | same
Discrete Random Variables | G | geometric distribution | same
Discrete Random Variables | H | hypergeometric dist. | same
Discrete Random Variables | P | Poisson dist. | same
Discrete Random Variables | λ | Lambda | average of Poisson distribution
Discrete Random Variables | ≥ | greater than or equal to | same
Discrete Random Variables | ≤ | less than or equal to | same
Discrete Random Variables | = | equal to | same
Discrete Random Variables | ≠ | not equal to | same
Continuous Random Variables | f(x) | f of x | function of x
Continuous Random Variables | pdf | prob. density function | same
Continuous Random Variables | U | uniform distribution | same
Continuous Random Variables | Exp | exponential distribution | same
Continuous Random Variables | k | k | critical value
Continuous Random Variables | f(x) = | f of x equals | same
Continuous Random Variables | m | m | decay rate (for exp. dist.)
The Normal Distribution | N | normal distribution | same
The Normal Distribution | z | z-score | same
The Normal Distribution | Z | standard normal dist. | same
The Central Limit Theorem | CLT | Central Limit Theorem | same
The Central Limit Theorem | X̄ | X-bar | the random variable X-bar
The Central Limit Theorem | μ_x | mean of X | the average of X
The Central Limit Theorem | μ_x̄ | mean of X-bar | the average of X-bar
The Central Limit Theorem | σ_x | standard deviation of X | same
The Central Limit Theorem | σ_x̄ | standard deviation of X-bar | same
The Central Limit Theorem | ΣX | sum of X | same
The Central Limit Theorem | Σx | sum of x | same
Confidence Intervals | CL | confidence level | same
Confidence Intervals | CI | confidence interval | same
Confidence Intervals | EBM | error bound for a mean | same
Confidence Intervals | EBP | error bound for a proportion | same
Confidence Intervals | t | student-t distribution | same
Confidence Intervals | df | degrees of freedom | same
Confidence Intervals | t_{α/2} | student-t with α/2 area in right tail | same
Confidence Intervals | p′; p̂ | p-prime; p-hat | sample proportion of success
Confidence Intervals | q′; q̂ | q-prime; q-hat | sample proportion of failure
Hypothesis Testing | H0 | H-naught, H-sub 0 | null hypothesis
Hypothesis Testing | Ha | H-a, H-sub a | alternate hypothesis
Hypothesis Testing | H1 | H-1, H-sub 1 | alternate hypothesis
Hypothesis Testing | α | alpha | probability of Type I error
Hypothesis Testing | β | beta | probability of Type II error
Hypothesis Testing | X̄1 − X̄2 | X1-bar minus X2-bar | difference in sample means
Hypothesis Testing | μ1 − μ2 | mu-1 minus mu-2 | difference in population means
Hypothesis Testing | P′1 − P′2 | P1-prime minus P2-prime | difference in sample proportions
Hypothesis Testing | p1 − p2 | p1 minus p2 | difference in population proportions
Chi-Square Distribution | χ² | Ky-square | Chi-square
Chi-Square Distribution | O | Observed | Observed frequency
Chi-Square Distribution | E | Expected | Expected frequency
Linear Regression and Correlation | y = a + bx | y equals a plus b-x | equation of a line
Linear Regression and Correlation | ŷ | y-hat | estimated value of y
Linear Regression and Correlation | r | correlation coefficient | same
Linear Regression and Correlation | ε | error | same
Linear Regression and Correlation | SSE | Sum of Squared Errors | same
Linear Regression and Correlation | 1.9s | 1.9 times s | cut-off value for outliers
F-Distribution and ANOVA | F | F-ratio | F ratio


Formulas 

This module provides an overview of Statistics Formulas used as a part of 
Collaborative Statistics collection (col10522) by Barbara Illowsky and 
Susan Dean. 

Formula 

Factorial 


$n! = n(n-1)(n-2)\cdots(1)$


$0! = 1$
Formula
Combinations


$\binom{n}{r} = \frac{n!}{(n-r)!\,r!}$


Formula 
Binomial Distribution 


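$X \sim B(n, p)$


$P(X = x) = \binom{n}{x} p^x q^{n-x}$, for $x = 0, 1, 2, \ldots, n$
Formula
Geometric Distribution


$X \sim G(p)$


$P(X = x) = q^{x-1} p$, for $x = 1, 2, 3, \ldots$
Formula
Hypergeometric Distribution


$X \sim H(r, b, n)$


$P(X = x) = \frac{\binom{r}{x}\binom{b}{n-x}}{\binom{r+b}{n}}$
Formula


Poisson Distribution


$X \sim P(\mu)$


$P(X = x) = \frac{\mu^x e^{-\mu}}{x!}$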
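
The four discrete formulas above translate directly into code. The sketch below is one minimal way to write them with Python's standard library; the function names are illustrative, and q is taken to be 1 − p as elsewhere in the text.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p), x = 0, 1, ..., n."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def geometric_pmf(x, p):
    """P(X = x) for X ~ G(p), x = 1, 2, 3, ... (trial of first success)."""
    return (1 - p) ** (x - 1) * p

def hypergeometric_pmf(x, r, b, n):
    """P(X = x) for X ~ H(r, b, n): x items from the group of size r in n draws."""
    return comb(r, x) * comb(b, n - x) / comb(r + b, n)

def poisson_pmf(x, mu):
    """P(X = x) for X ~ P(mu), x = 0, 1, 2, ..."""
    return mu**x * exp(-mu) / factorial(x)

print(binomial_pmf(2, 4, 0.5))          # 6/16
print(geometric_pmf(3, 0.5))            # (1/2)^2 * (1/2)
print(hypergeometric_pmf(1, 2, 3, 2))   # 2 * 3 / 10
print(poisson_pmf(0, 2.0))              # e^(-2)
```

A useful sanity check for any of these functions is that the probabilities over all possible values of x add up to 1.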
Formula 
Uniform Distribution 


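$X \sim U(a, b)$


$f(X) = \frac{1}{b-a}$, $a < x < b$
Formula


Exponential Distribution
$X \sim Exp(m)$


$f(x) = m e^{-mx}$, $m > 0$, $x \geq 0$
Formula
Normal Distribution


$X \sim N(\mu, \sigma^2)$


$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, $-\infty < x < \infty$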
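
As a rough illustration, the three density functions above can be written as plain Python functions. The function names are illustrative, and the normal density is compared against the standard library's `statistics.NormalDist` as a cross-check.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def uniform_pdf(x, a, b):
    """Density of X ~ U(a, b): constant 1/(b - a) on the interval, 0 elsewhere."""
    return 1 / (b - a) if a < x < b else 0.0

def exponential_pdf(x, m):
    """Density of X ~ Exp(m) with decay rate m > 0, defined for x >= 0."""
    return m * exp(-m * x) if x >= 0 else 0.0

def normal_pdf(x, mu, sigma):
    """Density of X ~ N(mu, sigma^2)."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

print(uniform_pdf(3, 0, 10))
print(exponential_pdf(0, 0.25))
print(normal_pdf(1.0, 0.0, 2.0), NormalDist(0.0, 2.0).pdf(1.0))
```

Remember that a density value is a height, not a probability; probabilities come from areas under these curves.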


Formula 
Gamma Function 


$\Gamma(z) = \int_0^\infty x^{z-1} e^{-x}\,dx$, $z > 0$

$\Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi}$

$\Gamma(m+1) = m!$ for $m$, a nonnegative integer
otherwise: $\Gamma(a+1) = a\,\Gamma(a)$
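
The three gamma function properties can be spot-checked with `math.gamma` from the Python standard library. The particular test values (m = 5, a = 2.5) are arbitrary choices for illustration.

```python
from math import factorial, gamma, isclose, pi, sqrt

# Gamma(1/2) equals the square root of pi
half_value = gamma(0.5)

# Gamma(m + 1) = m! for a nonnegative integer m
m = 5
integer_case = gamma(m + 1)

# Gamma(a + 1) = a * Gamma(a) for a non-integer a
a = 2.5
left_side = gamma(a + 1)
right_side = a * gamma(a)

print(isclose(half_value, sqrt(pi)))
print(isclose(integer_case, factorial(m)))
print(isclose(left_side, right_side))
```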

Formula 


Student-t Distribution 


$X \sim t_{df}$


$X = \frac{Z}{\sqrt{Y/n}}$, where $Z \sim N(0, 1)$, $Y \sim \chi^2_{df}$, and $n$ = degrees of freedom
Formula 
Chi-Square Distribution 


$X \sim \chi^2_{df}$


$f(x) = \frac{x^{\frac{n-2}{2}}\, e^{-\frac{x}{2}}}{2^{\frac{n}{2}}\,\Gamma\left(\frac{n}{2}\right)}$, $x > 0$, $n$ = positive integer and degrees of freedom


Formula 
F Distribution 


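$X \sim F_{df(n),\,df(d)}$
df(n) = degrees of freedom for the numerator


df(d) = degrees of freedom for the denominator


$X = \frac{Y/df(n)}{W/df(d)}$, where $Y$ and $W$ are independent chi-square variables with $df(n)$ and $df(d)$ degrees of freedom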


Some Mathematical Formulas
A variety of mathematical aids to probability analysis and calculations. 


Series
• 1. Geometric series. From the expression $(1 - r)(1 + r + r^2 + \cdots + r^n) = 1 - r^{n+1}$, we
obtain
Equation:
$\sum_{k=0}^{n} r^k = \frac{1 - r^{n+1}}{1 - r}$ for $r \neq 1$
For $|r| < 1$, these sums converge to the geometric series $\sum_{k=0}^{\infty} r^k = \frac{1}{1 - r}$.
Differentiation yields the following two useful series:
Equation:
$\sum_{k=1}^{\infty} k r^{k-1} = \frac{1}{(1 - r)^2}$ for $|r| < 1$ and $\sum_{k=2}^{\infty} k(k-1) r^{k-2} = \frac{2}{(1 - r)^3}$ for $|r| < 1$
For the finite sum, differentiation and algebraic manipulation yields
Equation:
$\sum_{k=0}^{n} k r^{k-1} = \frac{1 - r^n[1 + n(1 - r)]}{(1 - r)^2}$, which converges to $\frac{1}{(1 - r)^2}$ for $|r| < 1$
• 2. Exponential series. $e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}$ and $e^{-x} = \sum_{k=0}^{\infty} (-1)^k \frac{x^k}{k!}$ for any $x$
Simple algebraic manipulation yields the following equalities useful for the Poisson
distribution:
Equation:
$\sum_{k=n}^{\infty} k \frac{x^k}{k!} = x \sum_{k=n-1}^{\infty} \frac{x^k}{k!}$ and $\sum_{k=n}^{\infty} k(k-1) \frac{x^k}{k!} = x^2 \sum_{k=n-2}^{\infty} \frac{x^k}{k!}$
• 3. Sums of powers of integers. $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$ and $\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}$
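
The closed forms for the finite sums are easy to confirm numerically. The sketch below compares direct summation with each formula for one arbitrary choice of r and n (both values are illustrative).

```python
def geometric_sum(r, n):
    """Direct sum of r^k for k = 0..n."""
    return sum(r**k for k in range(n + 1))

def geometric_closed(r, n):
    """Closed form (1 - r^(n+1)) / (1 - r), valid for r != 1."""
    return (1 - r ** (n + 1)) / (1 - r)

def derivative_sum(r, n):
    """Direct sum of k r^(k-1); the k = 0 term is zero, so start at k = 1."""
    return sum(k * r ** (k - 1) for k in range(1, n + 1))

def derivative_closed(r, n):
    """Closed form (1 - r^n [1 + n(1 - r)]) / (1 - r)^2, valid for r != 1."""
    return (1 - r**n * (1 + n * (1 - r))) / (1 - r) ** 2

r, n = 0.5, 10   # arbitrary test values
print(geometric_sum(r, n), geometric_closed(r, n))
print(derivative_sum(r, n), derivative_closed(r, n))

# Sums of powers of integers
sum_first = sum(range(1, n + 1))
sum_squares = sum(i * i for i in range(1, n + 1))
print(sum_first == n * (n + 1) // 2, sum_squares == n * (n + 1) * (2 * n + 1) // 6)
```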


Some useful integrals 


• 1. The gamma function $\Gamma(r) = \int_0^\infty t^{r-1} e^{-t}\,dt$ for $r > 0$


Integration by parts shows $\Gamma(r) = (r - 1)\,\Gamma(r - 1)$ for $r > 1$


By induction $\Gamma(r) = (r - 1)(r - 2)\cdots(r - k)\,\Gamma(r - k)$ for $r > k$
For a positive integer $n$, $\Gamma(n) = (n - 1)!$ with $\Gamma(1) = 0! = 1$

• 2. By a change of variable in the gamma integral, we obtain
Equation:


$\int_0^\infty t^r e^{-\lambda t}\,dt = \frac{\Gamma(r + 1)}{\lambda^{r+1}}$, $r > -1$, $\lambda > 0$


• 3. A well known indefinite integral gives
Equation:


$\int_a^\infty t e^{-\lambda t}\,dt = \frac{1}{\lambda^2} e^{-\lambda a}(1 + \lambda a)$ and $\int_a^\infty t^2 e^{-\lambda t}\,dt = \frac{2}{\lambda^3} e^{-\lambda a}\left[1 + \lambda a + \frac{(\lambda a)^2}{2}\right]$


For any positive integer m,
Equation:


$\int_a^\infty t^m e^{-\lambda t}\,dt = \frac{m!}{\lambda^{m+1}} e^{-\lambda a}\left[1 + \lambda a + \frac{(\lambda a)^2}{2!} + \cdots + \frac{(\lambda a)^m}{m!}\right]$
• 4. The following integrals are important for the Beta distribution.
Equation:
For nonnegative integers $m, n$: $\int_0^1 u^m (1 - u)^n\,du = \frac{m!\,n!}{(m + n + 1)!}$
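
A quick numerical check of two of these integrals is sketched below. It uses a plain midpoint rule and truncates the infinite range at a point where the integrand is negligible; the parameter values (m = 2, λ = 1.5, a = 0.8) and the helper names are arbitrary choices for illustration.

```python
from math import exp, factorial

def midpoint_integral(func, lower, upper, steps=200_000):
    """Approximate the integral of func on [lower, upper] with the midpoint rule."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))

def tail_integral_closed(m, lam, a):
    """Closed form of the integral of t^m e^(-lam t) from a to infinity."""
    partial_exp = sum((lam * a) ** k / factorial(k) for k in range(m + 1))
    return factorial(m) / lam ** (m + 1) * exp(-lam * a) * partial_exp

def beta_integral_closed(m, n):
    """Closed form of the integral of u^m (1 - u)^n from 0 to 1."""
    return factorial(m) * factorial(n) / factorial(m + n + 1)

m, lam, a = 2, 1.5, 0.8          # arbitrary illustrative values
upper = a + 40 / lam             # beyond this point the integrand is negligible

tail_numeric = midpoint_integral(lambda t: t**m * exp(-lam * t), a, upper)
tail_exact = tail_integral_closed(m, lam, a)

beta_numeric = midpoint_integral(lambda u: u**2 * (1 - u) ** 3, 0.0, 1.0)
beta_exact = beta_integral_closed(2, 3)

print(tail_numeric, tail_exact)
print(beta_numeric, beta_exact)
```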


Some basic counting problems 


We consider three basic counting problems, which are used repeatedly as components of more 
complex problems. The first two, arrangements and occupancy are equivalent. The third is a 
basic matching problem. 


I. Arrangements of r objects selected from among n distinguishable objects. 


a. The order is significant. 
b. The order is irrelevant. 


For each of these, we consider two additional alternative conditions. 


1. No element may be selected more than once. 
2. Repetition is allowed.


II. Occupancy of n distinct cells by r objects. These objects are 


a. Distinguishable. 


b. Indistinguishable. 
The occupancy may be 


1. Exclusive. 
2. Nonexclusive (i.e., more than one object per cell) 


The results in the four cases may be summarized as follows: 


a. 1. Ordered arrangements, without repetition (permutations). Distinguishable objects,
exclusive occupancy.
Equation:


$P(n, r) = \frac{n!}{(n - r)!}$
2. Ordered arrangements, with repetition allowed. Distinguishable objects,
nonexclusive occupancy.
Equation:


$U(n, r) = n^r$


b. 1. Arrangements without repetition, order irrelevant (combinations). Indistinguishable
objects, exclusive occupancy.
Equation:


$C(n, r) = \frac{n!}{r!\,(n - r)!} = \frac{P(n, r)}{r!}$


2. Unordered arrangements, with repetition. Indistinguishable objects, nonexclusive
occupancy.
Equation:


$S(n, r) = C(n + r - 1, r)$
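
The four counting formulas can be verified by brute-force enumeration. The sketch below uses `itertools` from the Python standard library with small illustrative values n = 5 and r = 3, and compares each enumerated count with its formula.

```python
from itertools import (
    combinations,
    combinations_with_replacement,
    permutations,
    product,
)
from math import comb, perm

n, r = 5, 3                 # small illustrative values
items = range(n)

# a.1 ordered, no repetition: P(n, r) = n! / (n - r)!
ordered_no_rep = len(list(permutations(items, r)))

# a.2 ordered, repetition allowed: n^r
ordered_rep = len(list(product(items, repeat=r)))

# b.1 unordered, no repetition: C(n, r)
unordered_no_rep = len(list(combinations(items, r)))

# b.2 unordered, repetition allowed: S(n, r) = C(n + r - 1, r)
unordered_rep = len(list(combinations_with_replacement(items, r)))

print(ordered_no_rep, perm(n, r))
print(ordered_rep, n**r)
print(unordered_no_rep, comb(n, r))
print(unordered_rep, comb(n + r - 1, r))
```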


III. Matching n distinguishable elements to a fixed order. Let M(n, k) be the number of
permutations which give k matches.


Example:

n = 5

Natural order   1 2 3 4 5

Permutation     3 2 5 4 1   (Two matches: positions 2, 4)


We reduce the problem to determining M(n, 0), as follows:


1. Select k places for matches in C(n, k) ways.
2. Order the n − k remaining elements so that there are no matches in the other n − k places.
Equation:


$M(n, k) = C(n, k)\,M(n - k, 0)$


Some algebraic trickery shows that M(n, 0) is the integer nearest n!/e. These are easily
calculated by the MATLAB command M = round(gamma(n+1)/exp(1)). For example

>> M = round(gamma([3:10]+1)/exp(1));
>> disp([3:6;M(1:4);7:10;M(5:8)]')
     3       2       7       1854
     4       9       8       14833
     5       44      9       133496
     6       265     10      1334961
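
The same quantities can be computed in Python and checked by enumeration. The sketch below is an illustrative translation, not the author's code: the function names are invented here, and M(0, 0) is set to 1 by convention because rounding 0!/e would give 0.

```python
from itertools import permutations
from math import comb, exp, factorial

def no_match_count(n):
    """M(n, 0): permutations of n elements with no matches (integer nearest n!/e)."""
    if n == 0:
        return 1            # the empty arrangement has no matches
    return round(factorial(n) / exp(1))

def match_count(n, k):
    """M(n, k) = C(n, k) * M(n - k, 0): permutations with exactly k matches."""
    return comb(n, k) * no_match_count(n - k)

def brute_force_match_count(n, k):
    """Count permutations of 1..n with exactly k matches by listing them all."""
    natural = tuple(range(1, n + 1))
    return sum(
        1
        for arrangement in permutations(natural)
        if sum(a == b for a, b in zip(arrangement, natural)) == k
    )

table = [no_match_count(n) for n in range(3, 11)]
print(table)
print(match_count(5, 2), brute_force_match_count(5, 2))
```

Summing M(n, k) over k = 0, ..., n recovers n!, since every permutation has some number of matches.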


Extended binomial coefficients and the binomial series 


• The ordinary binomial coefficient is $C(n, k) = \frac{n!}{k!\,(n - k)!}$ for integers $n > 0$, $0 \leq k \leq n$


For any real x, any integer k, we extend the definition by
Equation:


$C(x, 0) = 1$, $C(x, k) = 0$ for $k < 0$, and $C(n, k) = 0$ for a positive integer $k > n$


and
Equation:


$C(x, k) = \frac{x(x - 1)(x - 2)\cdots(x - k + 1)}{k!}$ otherwise


Then Pascal's relation holds: $C(x, k) = C(x - 1, k - 1) + C(x - 1, k)$


The power series expansion about t = 0 shows
Equation:


$(1 + t)^x = 1 + C(x, 1)t + C(x, 2)t^2 + \cdots \quad \forall x,\ -1 < t < 1$


For x = n, a positive integer, the series becomes a polynomial of degree n.
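
A minimal sketch of the extended coefficient follows, with checks of Pascal's relation and the binomial series for a non-integer exponent. The function name and the test values (x = 2.5, x = −0.5, t = 0.3) are illustrative assumptions.

```python
def extended_binomial(x, k):
    """C(x, k) for real x and integer k: x(x-1)...(x-k+1) / k!, with C(x, k) = 0 for k < 0."""
    if k < 0:
        return 0.0
    result = 1.0
    for j in range(k):
        # Multiply one factor of the numerator and divide by one factor of k!
        result *= (x - j) / (j + 1)
    return result

# Pascal's relation for a non-integer x
x, k = 2.5, 3
pascal_left = extended_binomial(x, k)
pascal_right = extended_binomial(x - 1, k - 1) + extended_binomial(x - 1, k)

# Binomial series for (1 + t)^x with |t| < 1, truncated after 60 terms
x_series, t = -0.5, 0.3
series_value = sum(extended_binomial(x_series, j) * t**j for j in range(60))
direct_value = (1 + t) ** x_series

print(pascal_left, pascal_right)
print(series_value, direct_value)
print(extended_binomial(4, 6))   # a positive integer with k > n gives 0
```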


Cauchy's equation 
1. Let f be a real-valued function defined on (0, ∞), such that


a. f(t + u) = f(t) + f(u) for t, u > 0, and
b. There is an open interval I on which f is bounded above (or is bounded below).


Then f(t) = f(1)t ∀ t > 0
2. Let f be a real-valued function defined on (0, ∞) such that


a. f(t + u) = f(t)f(u) ∀ t, u > 0, and
b. There is an interval on which f is bounded above.


Then, either f(t) = 0 for t > 0, or there is a constant a such that $f(t) = e^{at}$ for t > 0


[For a proof, see Billingsley, Probability and Measure, second edition, appendix A20] 


Countable and uncountable sets 


A set (or class) is countable iff either it is finite or its members can be put into a one-to-one 
correspondence with the natural numbers. 
Examples 


• The set of odd integers is countable.

• The finite set {n : 1 ≤ n ≤ 1000} is countable.

• The set of all rational numbers is countable. (This is established by an argument known as
diagonalization).

• The set of pairs of elements from two countable sets is countable.

• The union of a countable class of countable sets is countable.


A set is uncountable iff it is neither finite nor can be put into a one-to-one correspondence with the 
natural numbers. 
Examples 


• The class of positive real numbers is uncountable. A well known operation shows that the
assumption of countability leads to a contradiction.

• The set of real numbers in any finite interval is uncountable, since these can be put into a
one-to-one correspondence with the class of all positive reals.


Tables 


Note: When you are finished with the table link, use the back button on
your browser to return here.


Tables (NIST/SEMATECH e-Handbook of Statistical Methods,
http://www.itl.nist.gov/div898/handbook/, January 3, 2009)


Student-t table 

Normal table 

Chi-Square table 

F-table 

All four tables can be accessed by going to 

http://www.itl.nist.gov/div898/handbook/eda/section3/eda367.htm


95% Critical Values of the Sample Correlation Coefficient Table 


95% Critical Values of the Sample Correlation Coefficient 


Note: The URL for this table is http://cnx.org/content/m17098/latest/


